Dr. Sharp: [00:00:00] Hello everyone. Welcome to The Testing Psychologist podcast, the podcast where we talk all about the business and practice of psychological and neuropsychological assessment. I’m your host, Dr. Jeremy Sharp, licensed psychologist, group practice owner, and private practice coach.
Hello everyone. Welcome back to The Testing Psychologist. I’m glad to be here with you as always.
I am thrilled to be back with some return guests. This is the second in a two-part series of sorts with Pearson staff, talking about digital assessment. In the first episode, about a month ago, we talked about the evolution of Q-interactive and the big picture of digital assessment. Today I’ve got Dr. Lisa Drozdick and Kristen Getz back to talk about more specifics related to digital assessment development.
We talk about the [00:01:00] scope and project management of digital assessment measures and how that differs from paper, all the variables that are measurable with digital assessment, and how to determine which ones are most important. We talk about usability and accessibility that gets taken into account with digital assessment. We spend a fair amount of time talking about gender neutrality and more inclusive assessment practices from the development side among many other things.
Before we transition, I’ll tell you a little about my guests. Dr. Lisa Whipple Drozdick is a licensed clinical psychologist, researcher, and Principal Research Director at Pearson. She received her clinical training at West Virginia University and the South Texas Veterans Healthcare System. She currently leads the development efforts on products assessing memory, executive functioning, cognitive ability, adaptive [00:02:00] functioning, and behavior. She also spearheads an initiative within Pearson focused on engaging with the research community, and she’s leading the development of the D-KEFS 2.0.
Kristen Getz is the Product Owner of Q-interactive. Her role is to define customer needs and business priorities for the Q-interactive development team. She’s been with Pearson for 10 years, first as a Research Director and then as a Product Owner. Prior to that, Kristen worked as a speech-language pathologist primarily in educational settings. So we get into the weeds a little bit with some of the technology and capabilities of digital assessment, but I hope you, like me, find this to be an engaging and informative episode.
If you’re a private practice owner or hopeful private practice owner, I always extend the invitation to those of you out there who’d like some support and accountability in a group setting. The Testing Psychologists’ mastermind groups are built [00:03:00] just for this. These are small groups of psychologists running testing practices who would like more accountability and support to grow your practice and reach those goals that you’re trying to reach. You can get more information at thetestingpsychologist.com/consulting and schedule a pre-group call.
All right, let’s get to my conversation with Lisa and Kristen about digital assessment development.
Hey, Kristen and Lisa, welcome back to the podcast.
Dr. Lisa: Thanks.
Kristen: Thank you. Great to be here.
Dr. Sharp: Yeah. Thanks for coming back. I’m always semi-amazed when people want to come back to [00:04:00] the podcast. I’m like, oh, okay, I didn’t totally scare them away the first time or two in some of these cases. I’m grateful to be here with you and I’m really excited to talk about our topic.
Last time, I guess this is a few weeks ago, maybe even a month ago, we talked about the evolution of digital assessment in Q-interactive specifically with the release of the new app. Today, we’re going to dig a little deeper into digital assessment in general and digital test development, which I am really excited to chat about because there’s so much that goes on behind the curtain with the development of these measures. I’d love to shine a spotlight on some of those processes. So, thanks for being here.
I’m trying to think, where do we start with all this? Let’s see [00:05:00] what would happen. What if we start at the very beginning? I would just love to hear what happens for y’all at the very beginning. When someone says, hey, what if we turned the WISC into a digital test? I don’t know how that goes, but what is the literal first step in a digital development process?
Dr. Lisa: Kristen’s going to look at that.
Kristen: Fair enough. Well, I feel like it really starts with the Research Director, so maybe you should start and then I’ll fill in from there.
Dr. Lisa: The first answer is always, in the research world, it depends, right? So in the example of a WISC, where we were potentially doing both a paper and a digital version, it’s a different set of questions than we would ask if it was something we were building fully digital. For example, on a WISC, we start asking questions about [00:06:00] how we make an equivalent product in the digital world that can take advantage of all the digital elements but still stay true to the paper version, because at this point the paper is still the driver of development when we’re doing both.
And at some point that will tip and we’ll be asking the opposite question, like, how do we make a paper equivalent to the digital? At this point, we’re not quite there yet, though I see that might be the next revision cycle. I’m not sure.
But now we are building our first digitally native test, the D-KEFS digital. And for that, it was really: what is this construct? What are the advantages of building it digitally that would be a reason to do it digitally native versus on paper? And then it starts: what are the constructs that we’re measuring? What are the key elements of tests that are [00:07:00] currently out there that we need to keep and retain in the digital world?
That begins the very intense process of breaking down every single element that you use in a test and figuring out what would change in digital versus paper. And then, how do we take advantage of the things you can’t do on paper that you can do digitally? So it’s a very intense process that, for D-KEFS at least, has taken many years.
Dr. Sharp: Yeah. Even that statement of breaking down the test seems like there are so many variables to try to capture. There’s the clinical component and what is actually being measured with the test but then there’s the input, what people see or touch. There’s the way [00:08:00] that they see or touch or interact with the test. There’s the timing, there’s the score. There’s so many variables it seems like to try and break down in the beginning.
I’m curious if y’all at this point, have you developed some, I don’t know if you call it a rubric or what, to guide that whole process? Do you know at this point, okay, if we’re going to try to devise a digital test, we need to look at boom, boom, boom, boom, boom? Like these five areas or departments or whatever it may be. That may be a dumb question. Y’all can say that if you’d like.
Kristen: Not at all. I think we wish there were five things we could just say: we have to look at these five things, mark them off, we did five, and now we have the answer and can move on. But it really is different with every test, with every task, with every measure, with every construct that we’re looking at. I think a lot of [00:09:00] times our questions are different and the way that we approach it is slightly different.
We definitely have an iterative process for digital test development now, where we’re going through our CogLabs. We’re trying to do things with very simple wireframes at first. Then we’re developing prototypes, that sort of thing. The variables that we’re trying to measure become a little bit stickier with clinical assessment because obviously nothing can be perfect with a paper wireframe or some early prototyping software that we may use. But we try to do the best that we can early on, because we want to fail early and make changes so that we can react to them and develop the most clinically sound product that we can.
Dr. Sharp: That makes sense.
Dr. Lisa: From the content side, it’s interesting to think through. You [00:10:00] did a little mini breakdown right there of what we do. Even a simple matrices task seems fairly simple, but you have to examine visibility: are they going to touch the screen? If they touch the screen, is it better with a stylus or a hand? All of that is thought through. The size of the stimuli on the iPad. The color contrast. Luckily the iPads are pretty consistent, which is one of the reasons we’ve stuck with iPads for so long: they’re very consistent across devices. But then comprehension of the instructions: are the instructions read by the examiner, or are they played by the system?
Every single one of those is tested and tried, and we try to think of every single potential edge case when we’re doing that. I think Kristen’s [00:11:00] team and the team of UX designers who work with us are very good at coming up with edge cases. Like, what if you have someone who is colorblind? What do you do then? All of that gets thought of in those early processes.
And it’s pretty intense, those early stages, just ironing out everything. And you have to think about it not just from the examinee side but also what the examiner is doing during that entire interaction. What does their recording process look like? What are the requirements on them? And colorblindness is another one there; frequently, with the size of the font, you have to make a font change. It can’t just be picking a color. So, lots of nuances.
Dr. Sharp: Oh, sure. I wonder if we could possibly even back up a little bit. I don’t know if this is reasonable, you can tell me, but are you able to outline just the basic [00:12:00] scope and research plan when you are developing a test? What does that generally look like? You know what I mean? What’s the beginning? What are the key points along the way? What’s actually happening big picture when y’all are doing this?
Dr. Lisa: Well, the initial stages are the scoping, and there is a full scoping and research plan from the get-go. It typically starts with: what are the goals of the project? Just sitting down and laying out what it is that we’re wanting to change or update or modify.
Sometimes with a norm update, it’s just a norm update. It’s a very simple goal. But sometimes we’re developing a brand new test, and that is a very extensive scoping. Why are we building this test? What is the audience? Who are going to be the users, the end users, both the examiners and the examinees, so that we can build for those requirements? Because it’s very different in a workplace setting where you might be [00:13:00] administering something, versus a preschool, or having to work within a nursing home situation.
So we have to think through who those users are. And then from those goals, you start building out. If the goal is to provide a cognitive assessment of memory in adults, let’s just say ages 18 to 90, what are the constructs within that that you want to measure? Is it a timed thing? Is it a verbal assessment? Is it visual? And you just walk through all of that.
Once you have that general idea, if we’re going to build it digitally native, the engineers and Kristen’s teams all come in. They look at what the plan is and say, well, that’s potentially doable this way. We talk to each other about, okay, this would be the ideal dream: can we get the extreme [00:14:00] video-enhanced VR experience, or do we need to tame that down a little bit?
And Kristen’s team is good at saying, okay, this is really doable, we can pick it up over here, we can build this. And we just run through features that might be enhancements to a particular product. Once we get the general idea, everybody in the group weighs in on the scope and how we get there. And then we capture it in a research plan of how we’re going to get there.
Dr. Sharp: I got you.
Kristen: I think there’s a lot of times, I’m sorry, go ahead, Jeremy.
Dr. Sharp: Well, I just wanted to clarify for folks quickly, in case they may not know, and I don’t know that I fully know actually, the relationship between research director and product manager. Who’s doing what? I almost get the sense research director is the vision and then product is the implementation or the [00:15:00] building or the architecting. I’d be curious how y’all would describe the relationship between those two departments and how they interact on a project like this.
Dr. Lisa: You want me to take that one? I’m on the research side. We drive the construct development, the item development, the statistics, all of the pieces that, think of the technical manual and the administration part of the manual, are really research-driven. So anything that goes into that, and the logistics of that, is all us.
We have product managers and we also have project managers. There’s a weirdness there. I think what you were describing were our project managers. They keep an eye on schedule and budget and make sure that tasks get done. Logistically, they’re ensuring that everything is resourced appropriately, making sure that things are happening when they should be happening, and doing a [00:16:00] lot of cross-group communication to keep all the groups who may not currently be working on it aware of delays or changes to the projects.
We have a separate group that is product management. They’re guides for our content and construct domains. We have a product manager for the neuropsychology product line who looks across all of the products and says, hey, we might have a hole over here, or I’m getting some submissions for product ideas from over here. And then they bring the team together to think about how we address those types of customer-raised concerns, or guide the goal setting of the whole project when we’re doing revisions.
So they’re going to be leading customer focus groups, customer surveys, all of that, and synthesizing sales information and feedback from all of our different customer service groups and our sales groups. Our digital group [00:17:00] often gives a lot of information, like, these are the concerns that customers have raised about the previous version. And those get codified into goals that the team then needs to address in the research part of it.
So they’re the visionaries, but broader visionaries than a product-specific visionary, if that… I like that term though. I may use visionary in my title somewhere.
Dr. Sharp: Nice. Take that for sure.
Kristen: But the product owner, specifically for the digital platforms, is responsible for managing and prioritizing the backlog. My job is to guide the business decisions down through what we call our Scrum teams: the quality assurance engineers, the business analysts, all the people that are actually building the product digitally once we give them guidance, making sure that we’re actually executing the goals of the business. [00:18:00] I work very closely with the different product managers, but I’m specific to the digital platforms.
Dr. Sharp: Got you. Thanks for sharing that and explaining a little bit. So getting back to the test development and the project itself, I’m curious how y’all have found, I guess, the complexity. It seems to me like complexity on digital development would be a lot more intense. First of all, has that been your experience or no?
Dr. Lisa: I think it’s differently intense.
Dr. Sharp: Okay.
Dr. Lisa: Does that make sense?
Dr. Sharp: Yeah.
Dr. Lisa: I’m trying to think of my experiences with paper. It’s different. I think some of the early experiences we had, Kristen, probably about 10 years ago when Q-interactive launched, were communication differences. I [00:19:00] don’t think that happens as much anymore. I think the digital side has learned some of the clinical speak and the clinical side has learned some of the digital speak, but early on there were definitely times when we could all be in the same room having the same conversation, using words that we all thought meant one thing, and then walked out of the room and everybody had a different impression of what the outcome and decision was. I should have spent a little time thinking of an example of that, but I know there were several times we came back in and everybody had done all the work completely differently.
Dr. Sharp: Just completely different.
Dr. Lisa: We’re like, wait, how did you get there? And they’re like, how did you get there? And we all thought, this is weird. How did we both walk out of there with such different understandings? With that kind of language, I think we’re better now about asking, okay, what do you mean by that? What do you understand me to be saying? That has become more of a conversation, so it’s less of an issue now. It’s just [00:20:00] a different kind of complexity.
Dr. Sharp: Yes.
Kristen: It might even be, and I’m going to speak for you on this one, Lisa, that you and I think it’s more interesting to develop digitally. I think there’s a lot of newness to it, and a lot of things that we couldn’t do on paper become very exciting for us to explore digitally. So in that regard, it’s probably more difficult if we had to break it down, but it’s a good difficult. It’s a good problem to have.
Dr. Lisa: I think there’s more people when you do a digital project than when you do a paper project. So I think that definitely adds complexity too.
Dr. Sharp: Sure. That may be a nice segue, actually. Kristen, you mentioned this idea that you can do some really cool things with digital that are not possible on paper. Maybe we could talk about some of [00:21:00] those really cool things.
Kristen: Yeah. When we think about a lot of our revisions, as Lisa mentioned, I was going through the whole process of how we think about what problems we’re trying to solve for the customer and what goals we have. A lot of our goals around digital, of course, are the things people like most about it. They want fewer manipulatives. They want shorter administration times. They want more data. They want all of these things that we start to really explore.
Those are all things we just sit down and have a conversation about. Like, gosh, Lisa, I’m thinking about one that we haven’t talked about maybe in a while, like the WMS Designs. Trying to figure out: what do we do to make that administration better for both the examinee and the examiner? What could we do for that?
We start having the conversation of, well, okay, could we get rid of the [00:22:00] grid? Could we get rid of the cards? What would that look like? And those become very interesting conversations between the research and the engineering group. Like what can we do in those regards?
Dr. Lisa: And it’s really intriguing breaking down pieces into what it is that we’re actually wanting to measure versus the extraneous variables that are introduced because of the way that we measure things. For example, on the WMS Designs test.
We’re obviously looking at spatial memory. That’s the construct we’re wanting to measure. In the actual paper world, we’re also measuring the person’s ability to pick up the cards and look through the cards, so there are some fine motor skills there. The ability to place the cards, the ability to translate from a visual [00:23:00] to a grid movement, and then to know when they’re done and respond to feedback from the examiner if they put too many in there, like, no, you can’t do that. Whereas on digital, some of those pieces, the fine motor skills of handling and manipulating those materials, aren’t there. There is still some movement around the screen, but it’s a lot less of a requirement on the examinee.
It also takes away a huge amount of the recording issues for the examiner, who’s trying to lift the back of the grid to see where they placed the cards, put it in the correct orientation in the record form, then use the record form to either translate it to a scoring assistant or try to score it right there in the record form. So for the examiner it’s a much easier experience, because all of that is removed; the responses are captured directly on the interface.
And then the question is, okay, are we still [00:24:00] getting spatial memory? We are, but there are definitely response processes around it that are different across the two. And so that’s when we do the equivalence studies, to make sure that the norms on one apply to the other. But if we didn’t have both a paper and a digital version, could we have reenvisioned what that test could be from the get-go, rather than trying to translate it from the paper world? Maybe there’s something else you can do in a digital space with an interface that gets at spatial memory, but not in that grid style. So that’s the next-step kind of stuff that we get to do.
Kristen: Yeah, that’s really exciting stuff. When you say, how can we measure that in a more pure form, take out all of those extraneous variables: that’s really what we’re hoping to get to when we’re not as tied to paper, when we can really think separately about these things.
Dr. Sharp: Right. [00:25:00] Have you found that it’s been tough to separate yourself from the paper norm? I think that would be very challenging. Even if I said, I’m going to forget everything and create something totally new, it would be very hard not to have the paper version in your mind as a starting point or as an influence. Have y’all found that to be true?
Kristen: Yeah, I think so, for sure.
Dr. Lisa: I think that’s also just the nature of research, because you’re building on what’s been out in the research for a long period of time, and it does take adjustment when new types of tasks come out. So there’s also that piece of: we’ve got to stay within the research bounds so that it makes sense to people, but also be on that exploratory and innovative side, which is where you start seeing optional tests pop up on various [00:26:00] measures. Like, hey, try this. And ideas of mixing paper and digital, because there are definitely purposes for paper. Paper is still a driver in many of these; manipulating things is important in certain types of evaluations, but in other evaluations, it’s just noise.
Dr. Sharp: Well, and it’s a cool process to be able to really evaluate and go through and figure out where it is just noise and where it’s absolutely necessary. I don’t know. I’m making a lot of assumptions because I don’t do the work that y’all do. I just get to use the products at the end.
It’s easy to look from the outside and say, man, we’ve been a little complacent over the last 100 years, basically doing the same tests for a long time. And it’s really nice to think about this fresh approach and to know that we have the [00:27:00] technology to revisit a lot of these classic measures and do things differently.
Kristen: Yeah, I think we’re seeing a lot of change in that regard. We talked with you a little bit about this in our last conversation, Jeremy: early on, people were so tied to the paper and to making sure they were 100% in control of that test administration. They wanted to do everything: write all responses, score everything, mark every contextual event, every behavioral observation.
But using Q-interactive as an example, 10 years later, we’ve seen this shift of people saying, all right, I don’t need to do everything. If the interface can do some of this, and can do it more accurately and faster and provide more information, then I want that to happen, because that makes [00:28:00] for a better assessment for myself and for the person I’m examining. And it also provides better, more accurate results.
So we’re seeing the switch, and it’s the same thing with the new tests being imagined and worked on right now. When we put D-KEFS in front of people and they see a fully digital version of some of these classic measures, they’re like, absolutely, this is amazing. This is what we need. This is the next step. We’re seeing this arc in what people in our field see as what’s next for test development.
Dr. Sharp: Sure. Well, I know we talked a little bit in the last conversation about the nuances and the different variables that we can get at with digital measures, like the potential for more precise timing and [00:29:00] error analysis and things like that. It’s really exciting.
I want to maybe selfishly go back to some of the nuts and bolts of this development. Maybe just to give y’all a chance to talk about all the work that goes into digital test development. We know there’s equivalency studies, you have to tackle that but I’m even interested, I would love to hear y’all talk about all the other aspects of it in terms of how do you decide on the motor aspects of the digital test or the stylus versus the finger you mentioned that. I’m like, oh, that’s a thing you have to think about? How do you even start to think about that? Or digital interface. And then we’re getting into user design and all these elements. I would love to touch on each of those and just hear how you approach development in those [00:30:00] areas.
Kristen: Maybe we can use one as an example and just talk it through. The stylus one is a good example because we can just use that; we could give you hundreds of different examples of different response types or hardware and things like that. The stylus is an interesting one because anytime we develop anything, we have to be very careful about saying, okay, you have to use a stylus. Does it have to be this stylus?
We really don’t want to tell people that because the stylus that you have this year is not going to be the stylus that you have next year. So we have to make sure that what we’re saying to somebody is sustainable for the life of the product and make sure that if we are going to tell you it has to be a certain way that we provide some information as to why we think that.
Lisa, do you want to talk about this? When we talk about the stylus, the research directors [00:31:00] literally go out and purchase a bunch of different kinds of styli and bring them in. And we have different people, different age groups, little kids. We’ve used real fat ones for kids, and we’ve had ones with flat tips. We literally look at all different kinds to see if it makes a difference with the response on the iPads.
Dr. Lisa: Yeah. And then we don’t standardize with any particular one, unless we find one that has significant problems; then we may say don’t use that kind. But the question of whether they use their finger or the stylus, I have a feeling, will probably be test by test, because it’s hard with preschoolers to get them even to hold a stylus, let alone use it appropriately. So my guess is, as we develop more tests for little ones, there’s going to be more finger use. But [00:32:00] you have to think about the surface of an iPad and how big that is. It’s not a very big surface, actually.
For as much as we use it, as ubiquitous as it is to us, there’s not a whole lot of real estate there. When you use paper, people lay their hands on it and write in different ways, and it doesn’t cause any problems at all, unless maybe there’s something sticky on their hand. With an iPad, if you lay your hand on it, it registers all the various touch points from that.
And there are Apple-native, and this is true of all tablets, native multi-touch reactions to things. Oh wait, they’re touching with three fingers; that means they want to minimize, or they’re doing something else. And if you do that with a whole hand, all of a sudden it’s much more difficult to control how the device is going to respond [00:33:00] to the person than with a single touch. And you can’t easily keep somebody touching with just one finger. You also cover part of the screen when you’re using your hand directly on the interface.
We watched quite a few early versions of the D-KEFS pilots. We spent a lot of time watching how people interacted with the iPad. Interestingly, we also watched how the examiner was interacting with the iPad and learned a lot there too. We did it with eight-year-olds. We did it with 90-year-olds. We did it with people with motor difficulties. We did all sorts of watching just to see what issues would come up. We came to the conclusion that they were minimized when people were using a stylus.
It just seemed to complicate everything when we let them use their fingers. Either their fingers didn’t have good conductivity, which is a huge difficulty particularly in [00:34:00] older adults, and so the device wouldn’t register their touch at all. Or kids would wipe their nose and then touch the iPad, and there’s something on the iPad that’s interfering.
There’s just tons of stuff that happened. With the stylus, even though that stuff can still get on the stylus, it’s not affecting the touch registration. So we’ve made the decision on that particular product to go with a stylus, because it doesn’t cover the screen, it minimizes the difficulty of making sure you have conductivity, and it keeps your hand off the screen so you can see everything.
Dr. Sharp: Yeah, so…
Kristen: Hardware’s really…
Dr. Sharp: Go ahead, Kristen.
Kristen: Hardware’s just a really interesting topic in general for us to keep up with, because obviously we’re at the mercy of Apple to a large extent as they make changes. They move the home button, or, at one point, they added the auto-brightness feature. And what does that do to our test development process? [00:35:00] Do we have to change settings, that sort of thing?
So it’s something that we constantly have to keep up on. We’re always getting the newest hardware ahead of time so we can test it in-house and start to say, what is different about this particular model? Or, like Lisa mentioned with resting your hand on it, sometimes they change the screen size and things like that.
There are very subtle differences that we have to keep up with. So we do try to make notes of that, and we’ve always been able to keep up with it and say, okay, this will work for Q-interactive, but you may have to change this particular setting, or whatever it is.
Dr. Sharp: Yeah. Have you all given any thought to moving toward the [00:36:00] Apple Pencil as an input device across the board, versus a stylus or finger?
Kristen: Yeah, we haven’t mandated the Apple Pencil, but we also haven’t said you can’t use it. I use an Apple Pencil. Obviously, it’s made for the device and it writes really well. We haven’t said that you must, or can, or can’t use it, and we’re very conscious of the cost for examiners of having two iPads. I’m very aware of that.
Dr. Sharp: Absolutely.
Dr. Lisa: I think we’ve found a lot of examiners really like it because they’re capturing written responses; they’re still writing all the verbatims down. But you’ve got to know your patient population. Are you going to hand them your expensive Apple Pencil every time? That’s not a cheap one [00:37:00] to replace regularly. So we’ve tried to stay away from mandating and standardizing only with that.
Dr. Sharp: That’s reasonable. I’m going to make a broad generalization and maybe step into some hot water but you’ve mentioned older adults two times. There is clearly a stereotype of older adults not being able to manage technology as well as younger individuals. I would have to think that this was a large consideration with some of the measures that you’re developing.
So maybe first, is that true? And if so, how do you navigate that when you have maybe an age range that doesn’t do well? I guess we could take it to the other side as well, like little kids also don’t do super well navigating technology. There are considerations across the board.
Kristen: Sure. Two extremes. You take the older adults and I can talk about the little ones. [00:38:00]
Dr. Lisa: Okay. First, not to generalize just to older adults, but it’s really anyone who hasn’t had a lot of technology experience or experience with iPads. There tends to be a larger number of those in the older populations, but we also see it in some of the younger who have never interacted with that. They’re like, I just don’t want to touch that. That’s not my thing.
We did do a lot of work early on with older adults, particularly with the WMS design piece. We put that in front of a lot of older adults, and we did the same with a lot of the D-KEFS work. That’s where all these early prototypes came in; we did a ton of stuff with those. We have a trail making task on the new D-KEFS, and if you do the paper version, you have to draw lines.
And so early on we really wanted to emulate that experience. We thought, okay, they just need to draw these lines. It [00:39:00] was incredibly obvious after probably five people drawing that it wasn’t working. They were like, we can’t do this on the iPad. For the range where they can hit a circle, there’s got to be a tolerance on it, and people were hitting all these circles unintentionally while trying to connect them.
The new version, or the version that we moved to after we tried multiple things to get the drawing lines to work, was just a tap interface. I think one of the things that we’ve learned is the simpler the interface, the less anxiety and stress you see in the examinees in general.
Okay. Yeah, I can tap circles. That’s not difficult, as opposed to drawing lines where you may waver, or you may lift your pen and now you’re trying to get it back on the exact same spot. It was just much easier. I think [00:40:00] also, during our standardization and any time that we’re collecting digitally, we encourage examiners to spend time habituating their person to the iPad surface. They can move it.
Originally we saw that, once you put it down in front of them, unlike a piece of paper where they feel like they can move it around the entire table or turn it upside down, people didn’t want to move the iPad. They didn’t touch the iPad except with the stylus we gave them. And so we were like, okay, make them move the iPad to where it’s comfortable. Demonstrate that you can move the iPad around. Let them use the notes function that is in every single iPad to practice writing and drawing and the ways they hold their stylus, so that you can address those things and let them practice as long as they need to before they feel comfortable with it.
And then limit the amount of fine motor skill that’s going to be needed across those, like tapping instead of [00:41:00] drawing. Drag and drop, too. If we have one task that has a drag and drop feature, practice it as long as you need to before the person moves on to the test items, because you don’t want my discomfort with drag and drop to be influencing my scores.
So just encouraging a lot more practice on the practice items. They’re not just one and done anymore. They really go until the person is comfortable. If you notice that there’s a stylus problem, take a minute and just let them repeat it. And we found that with most of our older adults. We can hear them; they’ll say, I don’t like the iPad, but then they’re interacting with it just like everybody else does. But they’ll voice a lot more concern, interestingly.
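As a rough illustration of the tap tolerance idea Lisa describes, here is a minimal sketch. The function name, coordinate scheme, and the 20-point tolerance value are all hypothetical, not Pearson’s actual implementation:

```python
import math

# Rough sketch of the "tolerance" Lisa mentions: a tap counts as hitting a
# circle if it lands within the circle plus an extra tolerance band. The
# names and the 20-point default are illustrative only.

def hit_circle(tap_x, tap_y, circle_x, circle_y, radius, tolerance=20.0):
    """Return True if a tap lands within the circle plus a tolerance band."""
    distance = math.hypot(tap_x - circle_x, tap_y - circle_y)
    return distance <= radius + tolerance
```

The design choice mirrors the point made in the conversation: widening the effective target makes tapping forgiving for examinees with less precise motor control, in a way that free drawing between circles cannot be.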
Kristen: I think it goes a lot to accessibility too, which is a very big topic for us. If you make something that’s accessible, especially accessible [00:42:00] to those age ranges at either the low end or the high end, good accessibility means good user experience and good design for everybody, for all ages.
So when Lisa was talking about trails, at first, and again it goes back to that paper mindset, we were thinking you must draw the line from A to B. Well, do you really need to draw the line from A to B? Because, like Lisa said, you’re introducing a lot of extra factors there, a lot of extra noise. It’s easier for an eight-year-old to tap A and B, and it’s easier for a 50-year-old to tap A and B, and it’s easier for a four-year-old to tap A and B.
I think those are the things that we’re learning as we go on. If we can make it accessible to everyone within the population, then it’s just good design in general and it makes for a better assessment experience.
Dr. Sharp: Right.
Kristen: I was going to say, the one thing maybe that people don’t know about the younger kids in [00:43:00] Q-interactive is a behavior that we saw quite frequently with the little ones because they’re so used to using an iPad. The way that they’re taught to use it is, touch everything on the screen, because if I touch something, it does something. So when we first introduced Q-interactive to the youngest kids, their only experience was with those types of applications where I touch the cow, the cow moos, the cow walks to the chicken. That’s the sort of thing they’re looking for.
Even the reaction of a gray touch state, which is what they saw when they touched a response on Q-interactive, that was enough for them to say, okay, I must touch everything on there because I want to see this gray circle. That was enough feedback for them. So for our youngest kids, we decided, they really don’t need to see the gray circle. They don’t need the feedback of, I touched something and I see the response, whereas that is important to different age groups.
For our very youngest kids, we [00:44:00] disabled the touch state for kids four and younger. So if you, for example, have a PPVT and you say, touch the dog, and they touch the dog, they’re not seeing that touch state, because that’s what drove them to touch everything on the screen.
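The age-based rule Kristen describes could be sketched as a single conditional. All names here are illustrative, not the actual Q-interactive code:

```python
# Rough sketch of the rule described above: children four and younger get no
# visual touch feedback, because the gray touch state itself rewarded
# indiscriminate tapping. Names and the cutoff constant are illustrative.

TOUCH_STATE_MIN_AGE = 5  # ages 4 and under: no gray touch state shown

def show_touch_state(examinee_age):
    """Decide whether a tap should produce the gray touch-state feedback."""
    return examinee_age >= TOUCH_STATE_MIN_AGE
```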
Dr. Sharp: That is super interesting and one of the little variables that I maybe would not think about, but that just comes from a lot of observation. A really cool factor to notice, and nice to have the flexibility to turn that on and off depending on the age. I like this idea of usability and accessibility. Are there any other examples or components that you can think of that are worth highlighting in this process, really looking through that usability lens?
Kristen: Well, I think it’s something that we [00:45:00] were looking at. It’s a very important topic to us as far as accessibility for both the examiner and the examinee and it’s different. It’s a different conversation depending on who we’re talking about. If it’s accessibility for the examiner, what can we do? Because a lot of things we can do for the examiner don’t really impact the test administration process whatsoever.
We can change the contrast on the iPad or we can enlarge font for them. We can do all of those things. When we start to talk about flipping it over for accessibility for the examinee, then we have a different conversation because we’ll have lots of conversations within the research group about if we do change this factor, then all of a sudden are we invalidating what we’re trying to test here.
For example, we have lots of reviews that are done on our content, and they’ll say, okay, you should be able to have the reading test read aloud. But if you read the reading test aloud, then we…
Dr. Lisa: Oral comprehension [00:46:00] test.
Kristen: Right, then we’re measuring something totally different. So we really looked at each of the groups to see what we can and can’t do for a particular group. For example, if it’s a motoric test and they have gross motor issues, we probably can’t do anything in that regard because that’s what we’re trying to measure.
But could we read something aloud? Well, sure, we could in that regard. Could we present something without reading it aloud? That becomes part of the scope that we originally talked about. So with this idea of accessibility, we go through each of the specific groups that we’re talking about and each of the tasks or subtasks that we’re creating and say, for this group, let’s say it’s individuals who are blind, what can and can’t we do for them and still measure what we’re trying to measure?
Dr. Sharp: I was going to ask about that. I’m glad you brought that up. [00:47:00] Does this give more flexibility? It sounds like it does, but does it give more flexibility to accommodate vision impairment? That seems like a really big one. That’s been a limitation for a lot of our measures. Or does it give more flexibility to design tests that might be a little more accommodating if we’re not adapting the existing ones?
Kristen: I think digital in general allows us to do accommodations and modifications more easily and more dynamically; it gives us more options than just traditional paper. We’ve always been very open. If somebody asks us, let’s say they need to do an evaluation of somebody with a visual impairment and they need to enlarge the stimuli items or whatever, we always give permission to do that, but that becomes more of a, we’re giving you permission. You can go and do that, [00:48:00] but we don’t keep large format copies of those. It’s just something that the examiner would have to do.
In digital, we can make those things more dynamic, like I said, and more available now. So we have the opportunity to do those and give people the flexibility within an application where they can make those changes.
Dr. Sharp: I love that. That situation came up in our practice two weeks ago where we were working with someone who’s visually impaired. This was one of my postdocs, and my first recommendation was, well, go look on Q-interactive and see if there’s a setting to just enlarge the font in the places where we need it. It seems like that could be a relatively easy thing to build in if we needed it, versus paper, which is what she ended up doing: just photocopying and blowing up. [00:49:00]

I wanted to also talk just a little bit about, how would I frame it? Maybe just decision making. That’s a very general heading, but we talked in our pre-podcast discussion about how digital really opens up the possibility to measure so many data points that are just not on the radar with paper. When you have all these different inputs and timing and screen dynamics, I don’t know, I’m making things up now, but there are just so many data points that you could potentially measure.
I would love to hear how y’all work through that process. I would imagine in the beginning it was maybe like a, oh my gosh, look at all this data at our fingertips. And there was some excitement and then it was like, oh, with great power comes great responsibility. You got to figure out how to pare it [00:50:00] down and actually measure the things that matter. I don’t know. I’m curious just what that discovery process was like when you realized you could measure so much, and then how you did go about narrowing it down to the things that hopefully matter.
Dr. Lisa: I can talk about that. The first time we opened up the output that was available for us to look at on the D-KEFS, across what was seven tests at the time, we had over 100,000 data points that we could have looked at. Everything from the loading time to display an image, to a timestamp collected every time someone touched anything on either the examiner’s side or the examinee’s side.
So just hitting this show picture button, we have a [00:51:00] time for that. We have a time from the show picture button to starting the timer. There were so many data points coming across that we had to stop and ask: okay, while we have this dream of 100,000 meaningful data points, what is it that we’re actually trying to get at in this assessment?
And so, what are the things that are specific to the construct that we’re looking at first? Some of them are obvious. The responses by the examinee are very important, so we wanted to capture everything there. And then we narrowed down. We don’t care how quickly the examiner is responding, for example, so we could eliminate some of that data.
And then you start looking at, okay, well, how does this fit together? I don’t need the noise of how long the screen was displayed for an interactive item when I have all the individual [00:52:00] times for how they interacted with the item. So you start weeding out these global pieces.
And then you start looking at what it would realistically mean if I provided times for, I don’t know, how many times the person coughed. Those aren’t picked up by the iPad, except maybe they are if they’re doing a verbal item where it’s being recorded. You just start looking at where the noise is, and then where the things are that actually get at the construct we’re measuring. Which of these actually reflect poor inhibition or poor flexibility or poor memory? And then you whittle it down.
Some of our tests measure time, some of them don’t. So for the ones where time didn’t really feel relevant, we stopped collecting all the time variables. I [00:53:00] know my psychometrics team will probably never hear this, but they would be happy if I said, oh yeah, we narrowed it down to 2,000 variables. I think we actually narrowed it down to 12,000 variables that are then used to calculate about a hundred scores. Those we can then start to present in the reports in a meaningful way, grouping the data appropriately.
I think one of the things, and maybe this is your downstream question, that we haven’t really tapped into, that I think is probably a future state, is report generation. What are the things we can actually do in digital that a report with just tables of data doesn’t tell us? I’m excited for maybe the next wave, when we look at report generation and start looking at how we can display data, how we can group data together and look [00:54:00] at it in different ways, because I think that’s a very exciting step. But right now I’m on the other side of it: how many scores can I give you that make sense and that you can actually meaningfully interpret with an individual?
Dr. Sharp: Right. Think about this as either a statistician’s or psychometrician’s dream or nightmare. You have 12,000 data points. I also think, though, that, who knows, this would be a great place to deploy some AI to comb all 12,000 data points and find the ones that actually relate to one another without a human needing to, if that makes sense. I wonder if that’s even a possible thing to do, and then the AI could determine which things are meaningful and which ones aren’t, outside the typical construct-driven data points. So many possibilities.
Dr. Lisa: So many.
Dr. Sharp: Well, I [00:55:00] know our time is starting to run short. It always goes by really fast. I know that we wanted to touch on this idea of gender neutrality and testing. I think that’s a really important area to touch on. Let’s talk about that just for a bit. Who would like to get us started on that?
Kristen: I think we can divide this one up again too. I think Lisa can talk about it from a research perspective and I can talk about a lot of the innovations that we’re trying to do along this and positive changes that we’re reacting to from our customers in this regard. So do you want to just talk a little bit, Lisa, to start and I’ll go into the reports and things like that?
Dr. Lisa: Sure. I thought you were going to go first.
Kristen: Oh, sorry.
Dr. Lisa: It’s okay. I think the big piece here is the increased flexibility of a digital interface, which is not really possible in a 10-year [00:56:00] or 15-year publication cycle.
Gender has really become a topic. When you think about 2008, when the last WAIS and WMS were published, there wasn’t a whole lot of discussion going on about gender. That came about over the past 10 years, though I will say it existed well before that; it just wasn’t being acknowledged in the way that it is currently. So how do we respond in a paper world to a shift in societal labeling? And labeling is probably not the right word, but those are the labels we attach when we describe the patient in our reports.
And prior to, I’d say, 10 years ago, it was male, female. And now that has expanded, rightfully so. And how do [00:57:00] we catch up with that on paper? You can’t. You have to republish an entire product, and that option is just not there. In digital, you can make that shift very quickly and universally across your products.
I know Kristen’s going to talk about the report features, but just being able to adequately and accurately describe your client without having to enforce specific categories on them is a huge capability in digital that’s not always there in the paper world, where the stratifications and things are based on census data.
In general, we still use sex as our differentiating factor for stratification. If there are known or observed differences between males and females in development or across the lifespan, where different norms are needed for that particular construct, we still do sex-based norms, [00:58:00] but we’re starting to investigate how we more appropriately serve these marginalized populations who aren’t falling into those binary categories.
We’re talking about it from a stratification point of view sometimes over on the research side, but Kristen’s group, and a gender equity group we have here in Pearson, particularly within clinical, address customer concerns, like what norms do you apply? And then Kristen’s group comes in and says, okay, how do we handle this in the reporting features, where very frequently we tie gender pronouns to our paragraph descriptions? Do you want to take over there, Kristen?
Kristen: Yeah, you were summarizing it there. We’re very aware of this issue, and the gender equity group [00:59:00] gets questions weekly on this topic. Now, like Lisa said, this wasn’t something we heard about five years ago, especially not 10 years ago, but we’re hearing two questions. One is, how do I interpret my results based on the way this individual identifies? The other is, why are you making me identify a person, because there’s no good reason for you to be doing that.
And like Lisa said, a lot of things were just historic in nature on the platforms, where if you chose male, then your report came out with he/him pronoun usage. That’s really what it boiled down to. So we’ve gone through and tried to eliminate that wherever possible. If it wasn’t a requirement, then we’ve removed it. So if you don’t choose male or female, or if you [01:00:00] choose other, then you have a gender-neutral report.
And in instances where we do offer combined norms, combined norms become the default if somebody selects other or doesn’t make a selection. We’re trying to expand that as much as possible to be as inclusive as we can.
Dr. Sharp: Yeah, that’s so important. There’s the need to be pretty agile, I think, in a lot of areas of our field when it comes to inclusivity and this is just one of them. I take your point really strongly that that’s very challenging in a 10 to 15-year test development cycle. To me, one of the biggest advantages of a digital administration is that you have a lot more flexibility.
Dr. Lisa: Yeah, the [01:01:00] flexibility is hard to match. You can’t really do that unless you have an annual publication cycle, and there aren’t many folks with that business model, at least in our industry.
Dr. Sharp: Sure. Well, let me ask a dumb question. This is truly naive, and I hope that some folks out there might also have this question. Just to make it super clear, how does digital allow for more flexibility and almost mimic a shorter publication cycle? I’m actually not exactly sure how that happens, and how you get around the idea that you still have norms, you still have standardization that you’re working with. So what’s happening behind the scenes in this case that gives more flexibility compared to a paper measure?
Kristen: Specifically, around the gender topic?
Dr. Sharp: Yeah.
Kristen: So we can make [01:02:00] changes to the actual interface within the system. Things that were required variables on the platforms, if they weren’t really necessary requirements, we’ve removed them. Some were just historic. For example, I’ll use the WISC, because that’s one people would ask me about all the time: on Q-global, why do you make me say whether or not my patient was male or female? What difference does it make?
So we said, well, the reason that variable was mandated was only because of the report, because it was using the pronouns within the report, which didn’t seem like a valid reason. That’s why I said, if they chose other or nothing was selected, we took that report and had our editors and research directors go back and rewrite it as a gender-neutral report.
So that’s how we were able to pivot on that very quickly. Those are [01:03:00] the types of things that we’ve been able to do. And also, like the combined norms example: before, if somebody chose other, we would go back and say, no, you have to tell us which norm group. Well, now we just assume, if you said other, that you want the combined norm group.
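The two rules discussed here, gender-neutral pronouns unless male or female is explicitly selected, and a combined-norms fallback for "other" or no selection, could be sketched roughly like this. The names and categories are illustrative only, not the actual Q-global behavior or API:

```python
# Rough sketch of the reporting rules described in the conversation. This is
# a hypothetical illustration, not Pearson's implementation.

PRONOUNS = {
    "male": ("he", "him", "his"),
    "female": ("she", "her", "her"),
}
NEUTRAL = ("they", "them", "their")

def report_settings(gender=None):
    """Pick a pronoun set and norm group from an optional gender selection."""
    pronouns = PRONOUNS.get(gender, NEUTRAL)
    norm_group = gender if gender in ("male", "female") else "combined"
    return pronouns, norm_group
```

The point of structuring it as a fallback is that removing the hard requirement to pick male or female never leaves the report generator without pronouns or a norm group.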
Dr. Sharp: Got you. So maybe that answers my second question then, the changing the report output. That makes sense to me. That feels very easy. I’m at the risk of totally minimizing the work that goes into it, but that feels…
Kristen: It’s a lot of work.
Dr. Sharp: Okay, labor intensive, I would say. The second question, though, is how do we get around the norms issue? I was under the assumption, I suppose, that sex-based norms like that are important in the scoring. And so maybe it is just that there are now combined norm sets for everything. Is that what you’re saying? We can get around it? [01:04:00]

Dr. Lisa: Yeah, I’ll talk about the norms a little bit. There aren’t always combined norms if there are known sex differences. For example, the CVLT-C has male norms and female norms, and you have to designate that. That’s a 25-year-old product. I’m not sure, if we redid it, that we’d see the same level of differences between males and females, but language development historically has been different across the sexes.
And so you get stuck if there is not a combined norm there. Going forward, we may create a combined norm if that is the most appropriate approach for a child who is not identifying as male or female. There isn’t enough research out there yet to say, and we don’t collect huge samples. We now [01:05:00] allow anyone who identifies as non-binary, anything other than male or female, to be in our samples, but that group is never going to be as big as the whole sample.
So there’s probably not the ability there to take the two people per age group and generate a norm set, but we can look at that group in comparison to the rest and see, hey, what is the most appropriate approach for this product with this population? And that might be creating a full combined norm where males and females aren’t separated, and then creating a male and female separation if folks want that for the particular individual in front of them.
I think the biggest thing that we’re learning through digital and through these changes that are occurring, let’s just say it’s [01:06:00] the increasing awareness, is that we have always generalized, maybe too much, maybe too little sometimes. So we’re offering more choices, and the examiner can make a lot more decisions around what is most appropriate for the patient sitting in front of them, as opposed to, you must do one of these two things.
I think that digital gives us more opportunity to provide more choices. Like, hey, you select this, whereas you don’t really want to have to search through a 600-page norms book to find, okay, what am I going to do? Because you can see with some of the kits you’ve received recently that there are just huge amounts of norms tables, and sorting through and trying to even find the right table can be daunting. Let alone, [01:07:00] hey, I’ll buy a scoring assistant just so I don’t have to go find the right table.
Dr. Sharp: That’s true. I appreciate y’all talking through that and bearing with some of these questions. The norming and standardization process is an area that is still a little mysterious sometimes, so it’s nice to hear that we have a lot more flexibility with digital in this realm than in many others. I’m aware, gosh, that we have, as always, talked about quite a bit and at the same time come to the end of our time together. So I will say thanks once again, and congratulations on getting this new Q-interactive app out there. I know a lot of folks are looking forward to D-KEFS 2.0 in the next year and a half, I guess, end of 2024, we said, [01:08:00] and many other measures. Hopefully, there are several more in the queue.
Dr. Lisa: Literally in the queue.
Dr. Sharp: Hey, that’s pretty good, Lisa. That’s a good note to end on. I don’t think we can top that.
Kristen, Lisa, it was great to talk with you again. Thanks.
Dr. Lisa: Thanks, Jeremy.
Dr. Sharp: All right y’all, thank you so much for tuning into this episode. Always grateful to have you here. I hope that you take away some information that you can implement in your practice and in your life. Any resources that we mentioned during the episode will be listed in the show notes, so make sure to check those out.
If you like what you hear on the podcast, I would be so grateful if you left a review on iTunes or Spotify or wherever you listen to your podcast.
And if you’re a practice owner or aspiring practice owner, I’d invite you to check out The Testing Psychologist mastermind groups. I have mastermind groups at every [01:09:00] stage of practice development; beginner, intermediate, and advanced. We have homework, we have accountability, we have support, we have resources. These groups are amazing. We do a lot of work and a lot of connecting. If that sounds interesting to you, you can check out the details at thetestingpsychologist.com/consulting. You can sign up for a pre-group phone call and we will chat and figure out if a group could be a good fit for you. Thanks so much.
The information contained in this podcast and on The Testing Psychologist website are intended for informational and educational purposes only. Nothing in this podcast or on the website is intended to be a [01:10:00] substitute for professional, psychological, psychiatric, or medical advice, diagnosis, or treatment. Please note that no doctor-patient relationship is formed here. And similarly, no supervisory or consultative relationship is formed between the host or guests of this podcast and listeners of this podcast. If you need the qualified advice of any mental health practitioner or medical provider, please seek one in your area. Similarly, if you need supervision on clinical matters, please find a supervisor with expertise that fits your needs.