The advent of generative Artificial Intelligence (AI) has opened up new possibilities in many fields, including the meticulous task of database curation for Information and Referral (I&R) organizations. Among the various AI tools available, Generative Pre-Trained Transformer (GPT) models stand out for their capability to process and generate human-like text. Many of us have experienced this firsthand in the 12 months since ChatGPT, which uses GPT behind the scenes, was first released to the public.
In an intriguing experiment, I collaborated with a US-based 2-1-1 service to explore how well GPT could apply the principles of an I&R Style Guide to improve real-world resource data.
Community Resource Specialist – Data Curators are tasked with the crucial job of ensuring that resource data adheres to their formal Style Guides. However, varying interpretations and human error can lead to inconsistencies, not to mention the time constraints that often prevent curators from addressing every record.
To harness GPT's potential, I translated the Style Guide into a series of "prompts" (essentially, instructions) that GPT could use to evaluate each record. Crafting these prompts is a blend of art and science, especially when it comes to minimizing errors and hallucinations.
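To give a sense of what "translating a Style Guide into prompts" can look like, here is a minimal sketch in Python. The rule text, wording, and JSON reply format are all illustrative assumptions, not the actual Style Guide or prompts used in the experiment.

```python
# Hypothetical stand-ins for real Style Guide rules -- NOT the actual guide.
STYLE_GUIDE_RULES = """\
1. State eligibility criteria in plain language.
2. Spell out age ranges and income thresholds explicitly.
3. Avoid internal jargon and unexplained acronyms.
"""

def build_prompt(eligibility_text: str) -> str:
    """Combine the rules, the record's Eligibility text, and scoring
    instructions into one prompt string asking for a structured reply."""
    return (
        "You are evaluating an I&R resource record's Eligibility field "
        "against the Style Guide rules below.\n\n"
        f"Style Guide rules:\n{STYLE_GUIDE_RULES}\n"
        f"Eligibility field:\n{eligibility_text}\n\n"
        'Reply with JSON: {"score": 1-5, "explanation": "...", '
        '"suggested_rewrite": "..."}'
    )
```

Asking the model for a fixed, machine-readable reply shape (here, JSON with a score, an explanation, and a suggested rewrite) is what makes it practical to process thousands of records and compare the results afterward.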
My focus was on the Eligibility field, which presented an ideal blend of complexity and variability for this test. The Style Guide rules were well written, with some qualitative guidance left somewhat open to interpretation.
After feeding over 9,000 program records into GPT via its Application Programming Interface (API), I received scores indicating each record's adherence to the Style Guide, along with an explanation of each score and suggestions for improvement where necessary.
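A batch run like this boils down to a loop: send each record's prompt to the model, then parse the structured reply. The sketch below assumes the JSON reply shape and field names shown (hypothetical); it takes the API call as a plain callable so the example stays independent of any particular client library, and it marks records whose reply cannot be parsed as unscored, which is roughly how records with scant or no data fall out of the tally.

```python
import json

def parse_review(raw: str) -> dict:
    """Parse the model's JSON reply into score / explanation /
    suggested rewrite; anything that is not valid JSON with an
    integer score is marked unscored (score = None)."""
    try:
        reply = json.loads(raw)
        return {
            "score": int(reply["score"]),
            "explanation": reply.get("explanation", ""),
            "suggested_rewrite": reply.get("suggested_rewrite", ""),
        }
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return {"score": None, "explanation": "", "suggested_rewrite": ""}

def review_records(records: list, ask_model) -> list:
    """Run every record through the model and collect parsed results.
    `ask_model` stands in for the actual API call: it takes a prompt
    string and returns the model's raw reply string."""
    return [
        {"id": rec["id"], **parse_review(ask_model(rec["prompt"]))}
        for rec in records
    ]
```

Defensive parsing matters at this scale: across 9,000+ records the model will occasionally reply with something other than the requested format, and it is better to flag those records for human review than to crash the run.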
The initial results were promising. GPT determined that 45.7% of the records exhibited very good conformance. Another 34.7% showed moderate conformance, while 18.6% showed considerable room for improvement. The remaining 2.1% of records could not be scored due to scant or no data, mostly in inactive records.
These findings underscore the high quality of work performed by human curators while also highlighting opportunities for enhancement. About half of the records could benefit from revisions to improve clarity and accessibility for various stakeholders, including 2-1-1 agents, help-seekers, data partners, researchers, and the general public.
The AI-generated scores, explanations, and suggested rewrites provide valuable insights into interpreting the Style Guide and prioritizing updates. While not flawless, these AI contributions serve as a solid foundation for further refinement.
The experiment is ongoing, with further analysis comparing the performance of different GPT versions (3.5 and 4) and extending the testing to other database fields. I may try additional AI models such as Claude, PaLM 2, and Gemini. I am also exploring streamlined processes and tools that will make it easier for people to review GPT's suggestions and integrate them back into the resource records in the database.
This exploration is just the beginning. If you're intrigued by the potential of AI to enhance data curation and would like to be part of this innovative journey, I invite you to join me. Together, we can responsibly start harnessing the incredible power of generative AI in a wide range of I&R activities, including database curation.