Student-built AI tool using Southern Accent archives

By: Joshy Kasahara

I am a graduate student from Tokyo, Japan, currently studying at Southern Adventist University as I work toward becoming an AI engineer.

I built Hana AI from the ground up as part of my graduate work in applied computer science. It is a self-hosted AI research assistant that uses Retrieval-Augmented Generation (RAG), a method in which the AI looks up relevant information from a document collection before answering a question. This allows the system to work with local knowledge collections, such as materials related to the university, and helps it produce responses that are grounded in source material rather than relying only on general AI knowledge.
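As a rough illustration of the look-up-then-answer flow, and not Hana AI's actual code, the sketch below ranks a few made-up placeholder sentences by word overlap with a question and then assembles a prompt that a language model could answer from. The function names (`retrieve`, `build_prompt`), the sample sentences and the word-overlap scoring are all assumptions chosen to keep the example small.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the question.

    Word overlap is a simple stand-in for the retrieval step of RAG.
    """
    q = tokenize(question)
    ranked = sorted(documents, key=lambda d: len(q & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, passages: list[str]) -> str:
    """Place the retrieved passages ahead of the question."""
    sources = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the sources below.\n"
        f"Sources:\n{sources}\n"
        f"Question: {question}"
    )


# Placeholder sentences invented for this example, not archive text.
documents = [
    "The library renovation added new study rooms.",
    "Enrollment growth led to a new residence hall.",
    "The cafeteria menu changed this semester.",
]
question = "What did the library renovation add?"
prompt = build_prompt(question, retrieve(question, documents, k=1))
print(prompt)
```

In a full system, the assembled prompt would be sent to a language model, which writes the answer from the listed sources.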

Hana AI is designed to answer questions based on specific collections of text documents through a chat-style interface similar to ChatGPT. Google’s NotebookLM offers a similar experience, but Hana AI runs entirely on the School of Computing server, which gives the university more control over its data and deployment. It can also be offered as a centrally managed, login-based service, giving the university more flexibility in providing access. NotebookLM sharing, by contrast, is managed at the notebook level, and public sharing depends on the type of account being used.

The name “Hana AI” reflects the personality and purpose behind the project. It is partly inspired by the name “Hannah,” giving the system a more human and approachable feel.

In Japanese, hana means “flower” and can also suggest something special. The name reflects sensitivity to discovery, which aligns with the project’s goal of helping users uncover meaningful details in large collections of documents.

The project began as an extension of a class project for Data Mining and Analytics. For that class, Dr. Bob Young from the Academic Administration office provided about 1,200 typewritten university letters as a dataset.

I later expanded the system by adding over 200 issues of the Southern Accent from 2014 to 2025. Because the university has archived Accent issues dating back to 1929, the system has the potential to work with more than a century of campus newspaper history.

To answer a question, the system uses semantic search powered by FAISS, an open-source library developed by Meta, to retrieve the most relevant passages from across the archive. The AI then reads those passages and generates a response based on them.
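The retrieval step can be pictured as a nearest-neighbor search over vectors. The sketch below is a simplified stand-in rather than Hana AI's implementation: it turns each passage into a normalized word-count vector and scores passages by dot product with the query. A real system would typically use a learned embedding model to capture meaning and a library such as FAISS to make the search fast over millions of vectors. The passages, the `embed` and `search` helpers and the word-count vectors are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def words(text: str) -> list[str]:
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def embed(text: str, vocabulary: list[str]) -> list[float]:
    """Toy embedding: unit-length word-count vector over a fixed vocabulary."""
    counts = Counter(words(text))
    vec = [float(counts[w]) for w in vocabulary]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def search(query_vec: list[float], index: list[list[float]], k: int) -> list[int]:
    """Brute-force search: return positions of the k highest-scoring vectors."""
    scores = [
        (sum(q * d for q, d in zip(query_vec, doc_vec)), i)
        for i, doc_vec in enumerate(index)
    ]
    scores.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by position
    return [i for _, i in scores[:k]]


# Placeholder passages invented for this example, not archive text.
passages = [
    "construction began on a new residence hall",
    "the student center opened a new study lounge",
    "the choir toured during spring break",
]
vocabulary = sorted({w for p in passages for w in words(p)})
index = [embed(p, vocabulary) for p in passages]

top = search(embed("new residence hall construction", vocabulary), index, k=2)
for i in top:
    print(passages[i])
```

The passages returned by the search are what the language model would then read before writing its answer.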

One practical use is exploring long-term changes on campus. For example, asking Hana, “How has the campus changed over time?” can draw from years of Southern Accent coverage and produce a source-based summary about construction projects, enrollment-driven expansion, accessibility improvements and changes in student life spaces.

Because it can read across more than a decade of archived articles, it can show how a topic developed over time rather than simply returning isolated facts.

The system behind Hana AI is designed to work with very large collections of documents, potentially ranging from 10 million to 100 million textbook-sized pages, roughly comparable to between 8,000 and 80,000 NIV Bibles’ worth of text, depending on formatting and edition.
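The comparison works out if one Bible is taken as about 1,250 textbook-sized pages, which is the ratio implied by the figures above (10 million divided by 8,000). The short check below makes that arithmetic explicit. The constant and the function name are illustrative assumptions, not measurements of any particular edition.

```python
# Assumption: pages per Bible implied by the article's figures
# (10,000,000 pages / 8,000 Bibles = 1,250 pages each).
PAGES_PER_BIBLE = 1_250


def bibles_equivalent(pages: int, pages_per_bible: int = PAGES_PER_BIBLE) -> int:
    """Convert a page count into a whole number of Bible-sized volumes."""
    return pages // pages_per_bible


for pages in (10_000_000, 100_000_000):
    print(f"{pages:,} pages is about {bibles_equivalent(pages):,} Bibles")
```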

I recently presented Hana AI at a Science Department luncheon, where I shared the technology behind the system and demonstrated its capabilities. At this stage, Hana AI is intended for internal and research-oriented purposes, but it shows how AI can strengthen university research infrastructure and make institutional knowledge more accessible in an intuitive and practical way.

Demo page: https://report-to-southern-accent.vercel.app/ 
