|
Are You Smarter Than a Chatbot: Exploring Generative Artificial Intelligence (AI)’s Performance on a Mock Board Certified Behavior Analyst (BCBA) Examination and Complex Relational Framing Programs |
Sunday, May 26, 2024 |
12:00 PM–12:50 PM |
Convention Center, 100 Level, 105 AB |
Area: VBC; Domain: Translational |
Chair: Albert Malkin (Western University) |
Discussant: Adrienne Jennings (Daemen University) |
CE Instructor: Adrienne Jennings, Ph.D. |
Abstract: Generative artificial intelligence (AI) tools, such as ChatGPT, are gaining popularity for their ability to generate human-like responses to requests. These chatbots use natural language processing and machine learning algorithms to produce their responses. This symposium will discuss two studies related to behavior analysis and AI. Study 1 evaluates ChatGPT’s performance on the 5th edition task list mock exam developed by Behavior Development Solutions (BDS). Results indicate that ChatGPT-3.5 scored 61% and ChatGPT-4.0 scored 89%. ChatGPT-4.0 performed best in experimental design (100%); measurement, data display, and interpretation (95%); ethics (94%); and behavior change procedures (94%). ChatGPT performed worst in selecting and implementing interventions (75%) and behavioral assessments (78%). In Study 2, the performance of several generative AI chatbots was assessed across multiple PEAK Relational Training System programs. Preliminary results indicate that generative AI chatbots respond more accurately in the context of coordination in the equivalence modules (100% correct across 120 trials) but are less capable of accurate responding when presented with complex relations (i.e., opposition) in the PEAK transformation modules (35.83% across 120 trials). Implications, considerations, and limitations of using generative AI in behavior analytic practice and research will be discussed.
Instruction Level: Intermediate |
Keyword(s): artificial intelligence, language, technology |
Target Audience: Some background in relational frame theory and artificial intelligence technologies |
Learning Objectives: At the conclusion of the presentation, participants will be able to: 1. Describe how generative artificial intelligence systems learn; 2. Describe the implications of using ChatGPT in behavior analytic practice; 3. Describe generative artificial intelligence systems’ ability to engage in complex language
|
ChatGPT (and Other Language Models): Considerations for Behavior Analysis Education, Research, and Practice |
(Basic Research) |
JUSTIN BOYAN HAN (University of Florida), David J. Cox (RethinkFirst; Endicott College), Meka McCammon (University of South Florida), Sarah E. Bloom (University of South Florida), Stephen E. Eversole (Behavior Development Solutions, LLC), Joel Weik (Behavior Development Solutions)
Abstract: Generative artificial intelligence (AI) built using Large Language Models (LLMs) has been shown to be capable of augmenting the work and decisions of practitioners and researchers across various disciplines. One LLM, ChatGPT, is gaining public attention because its overall capabilities, human-sounding linguistic performance, and widespread accessibility exceed those of competitor LLMs. In the current paper, we assessed the performance of two versions of ChatGPT on the Behavior Analyst Certification Board (BACB) 5th edition task list mock exam, developed by Behavior Development Solutions (BDS). Overall, ChatGPT-3.5 scored 61% and ChatGPT-4.0 scored 89%. ChatGPT-4.0 performed best in experimental design (100%); measurement, data display, and interpretation (95%); ethics (94%); and behavior change procedures (94%). ChatGPT-4.0 performed worst in selecting and implementing interventions (75%) and behavioral assessments (78%). Based on these results, we discuss the implications, considerations, and limitations that ChatGPT and other LLMs currently pose for behavior analysts working in behavior analytic education, practice, and research.
|
Chatting With Machines: Can Artificial Intelligence (AI) Chatbots Truly Grasp Human Language, Assessed via the PEAK Relational Training System |
(Basic Research) |
LAUREN ROSE HUTCHISON (Missouri State University), Albert Malkin (Western University), Allison Kretschmer (Progressive Behaviour Solutions/Western University), Marc J. Lanovaz (Université de Montréal)
Abstract: Generative artificial intelligence (AI) tools, such as ChatGPT, function as chatbots in which users enter queries and receive responses in real time. These tools utilize AI algorithms, such as deep learning, to process and generate natural language. Relational frame theory (RFT) provides an explanation of how humans acquire and produce complex language through direct reinforcement and derived relational responding. The PEAK Relational Training System is a curriculum that has been used to teach derived relational responding. The purpose of this study was to evaluate whether various chatbots could derive relations between stimuli. Several programs from the equivalence and transformation modules of PEAK were selected to run with ChatGPT-4, Bing Chat, and Perplexity. The selected PEAK programs included topics such as math, grammar, science, perspective taking, and social skills. Preliminary results indicate that ChatGPT-4 is capable of deriving equivalence relations in the context of coordination (ChatGPT-4, 100% correct across 120 trials; Perplexity, 91.67%) but is less capable of deriving complex relations in the context of opposition (ChatGPT-4, 35.83% across 120 trials; Perplexity, 21.67%). Implications for how AI might be used to simulate human language learning will be discussed, as well as AI’s ability to engage in complex language.
|
|