AI Systems & Platform Design
AI Prompt Engineering · AI Enablement · Johns Hopkins Carey Business School
Overview
In the summer of 2025, I co-designed an AI platform that was selected for presentation at the 2025 INFORMS International Meeting in Singapore, one of the most prestigious operations research and management science conferences globally.
That work demonstrated production-level AI architecture and governance capability, and directly preceded my recruitment to lead AI enablement for the Operations Management program at Johns Hopkins Carey Business School — a funded initiative to incorporate AI into graduate education for the first time.
What I Built
My role spanned prompt engineering, system architecture, coding, documentation ownership, AI literacy instruction, governance research and framework development, and direct collaboration with the IT department and Professor Warren on redesigning the course to incorporate AI.
AI-Powered Autograder - The first of its kind for a core MBA program
Built and tested across HopGPT · LiteLLM · Claude · GPT · Llama · Hunch.tools · AnyLogic · Python
The core challenge was not simple automation. Operations Management assessments are open-ended and case-based — requiring evaluation of quantitative reasoning, analytical assumptions, and the quality of operational recommendations. Correct answers alone were insufficient.
I designed a multi-layered evaluation architecture:
Prompt architecture separating quantitative scoring from qualitative reasoning evaluation
Python-based scoring logic with validation loops: the system flagged ambiguous responses for human review rather than overconfidently grading them (see the sketch after this list)
Benchmarked AI grading against human grading samples, iterating until consistency was achieved across accuracy, reasoning quality, and feedback usefulness
Redesigned course case instructions to align with AI-assisted workflows without compromising academic rigor
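A minimal sketch of that architecture's core loop, assuming separate quantitative and qualitative scoring passes. Every name, threshold, and the stubbed model call below is illustrative, not the production system:

```python
from dataclasses import dataclass

AMBIGUITY_THRESHOLD = 0.7  # illustrative cutoff, not the production value

@dataclass
class PassResult:
    score: float        # 0.0-1.0 rubric score for this pass
    confidence: float   # model's self-reported certainty
    feedback: str

def score_with_llm(submission: str, rubric_section: str) -> PassResult:
    """Stand-in for a real HopGPT/LiteLLM call; returns canned values here."""
    return PassResult(score=0.85, confidence=0.9,
                      feedback=f"Evaluated against: {rubric_section}")

def grade(submission: str) -> dict:
    # Separate passes: quantitative correctness vs. reasoning quality.
    quant = score_with_llm(submission, "quantitative rubric")
    qual = score_with_llm(submission, "qualitative reasoning rubric")

    # Validation loop: low-confidence results are flagged, never auto-graded.
    if min(quant.confidence, qual.confidence) < AMBIGUITY_THRESHOLD:
        return {"status": "flagged_for_human_review",
                "reason": "ambiguous response below confidence threshold"}

    return {"status": "graded",
            "quantitative": quant.score,
            "qualitative": qual.score,
            "feedback": [quant.feedback, qual.feedback]}

print(grade("Student case analysis text..."))
```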
The Challenge
We often forget that institutions of higher education are not only research and teaching organizations; they are businesses first. And when AI arrived at the doorstep of higher education, it challenged everything: the ways we work, think, teach, and learn.
Faculty faced a problem with no clean solution: They knew students were using AI but had no visibility into which models, how extensively, or at what level of AI literacy or experience. There was no way to control for it — and attempting to ban it entirely was both futile and counterproductive in a world where AI fluency is increasingly a professional requirement.
The old frameworks weren't helping either. Rigid rubrics, rote multiple choice quizzes, and fixed case formats were designed for a world where information was scarce and answers were verifiable. AI tooling, and the critical rethinking of workflows it forces, exposed how much of traditional graduate assessment was measuring compliance rather than genuine thinking.
The deeper design challenge was this:
How do you redesign graduate-level coursework for students encountering AI as a serious tool for the very first time — ensuring they develop genuine critical thinking rather than sophisticated copy-paste habits? How do you make AI usage visible, accountable, and educationally productive rather than a shortcut that undermines learning?
That was the problem this initiative was built to solve.
The Platform: HopGPT
All development was built on HopGPT, Johns Hopkins' enterprise AI gateway powered by LiteLLM, providing secure unified access to frontier models including Anthropic Claude, OpenAI GPT, and Meta Llama through a single interface. I was among the earliest practitioners to build production systems on the platform: benchmarking model performance across all available LLMs, testing API capabilities, and working directly with JHU IT to surface improvements to their documentation and developer processes. I functioned simultaneously as a builder, a tester, and an informal product collaborator during one of the platform's earliest institutional deployments.
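What that benchmarking can look like in practice, as a minimal sketch: the gateway URL, model identifiers, and prompt below are illustrative placeholders, not HopGPT's actual configuration.

```python
import time
import litellm

# Placeholder gateway URL and model IDs; HopGPT's real configuration differs.
GATEWAY_URL = "https://hopgpt.example.jhu.edu/v1"
MODELS = ["openai/gpt-4o", "anthropic/claude-3-5-sonnet", "meta/llama-3-70b"]

PROMPT = [{
    "role": "user",
    "content": "A two-station line runs at 4 min/unit (A) and 6 min/unit (B). "
               "Identify the bottleneck and the line's hourly capacity.",
}]

# Send the same prompt to each model through the one gateway and
# compare latency and answer quality side by side.
for model in MODELS:
    start = time.time()
    response = litellm.completion(
        model=model,
        messages=PROMPT,
        api_base=GATEWAY_URL,   # route through the gateway
        api_key="...",          # institutional credential, elided
    )
    elapsed = time.time() - start
    print(f"--- {model} ({elapsed:.1f}s) ---")
    print(response.choices[0].message.content[:300])
```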
Business Cases the System Evaluated
Anteco Greco Coffee Shop: queuing theory, capacity analysis, bottleneck identification, and staffing recommendations (a worked queuing sketch follows this list)
Hip Op Surgical: AnyLogic simulation modeling of operating room (OR) flows, utilization vs. patient wait time trade-offs
BUCC Case: value stream mapping, Lean Six Sigma waste identification, and process improvement recommendations
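To give a sense of the quantitative reasoning the autograder had to evaluate, here is a basic M/M/1 queuing calculation of the kind a coffee-shop staffing case invites. The arrival and service rates are invented for illustration:

```python
# M/M/1 queue: single server, Poisson arrivals, exponential service times.
# Illustrative numbers only; not taken from the actual case.
arrival_rate = 20.0   # lambda: customers per hour
service_rate = 25.0   # mu: customers per hour one barista can serve

rho = arrival_rate / service_rate   # utilization (must be < 1)
L_q = rho**2 / (1 - rho)            # avg number waiting in queue
W_q = L_q / arrival_rate            # avg wait in queue (hours)

print(f"Utilization: {rho:.0%}")              # 80%
print(f"Avg queue length: {L_q:.1f}")         # 3.2 customers
print(f"Avg wait: {W_q * 60:.1f} minutes")    # 9.6 minutes
```

A strong submission pairs numbers like these with stated assumptions and a staffing recommendation; the autograder had to judge both the arithmetic and the reasoning built on it.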
The Co-Creation Framework
The most substantive governance contribution of this engagement grew from a question that had no easy answer: how do we ensure students are genuinely learning, and not simply outsourcing their thinking to AI?
Drawing on empirical governance research and ongoing classroom reception analysis, Professor Warren and I co-developed the Co-Creation Framework — a pedagogical and governance model that repositioned AI as a collaborator rather than an answer engine. Students were required to submit Chain of Thought Transcripts alongside their work — documenting their prompting flow, follow-up reasoning, and how AI shaped their path to the final output. This made thinking visible, preserved critical reasoning, and created an auditable record of human-AI collaboration.
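As an illustration only, a hypothetical skeleton of what a Chain of Thought Transcript might contain (the actual course template is not reproduced here):

```
Chain of Thought Transcript (hypothetical skeleton)
1. Task framing: the assignment question restated in the student's own words
2. Prompting flow: each prompt sent, in order, with the model used
3. Follow-up reasoning: why each AI response was accepted, revised, or rejected
4. Divergence points: where the final answer departs from the AI's output
5. Attribution: which parts of the submission are AI-assisted vs. the student's own
```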
The framework received strong reception among Johns Hopkins faculty and was subsequently adopted across multiple courses. At a moment when AI is challenging every sector to fundamentally rethink its workflows, this work represented a genuine opportunity to experiment at the frontier: deploying responsible AI governance and reimagining pedagogical methods to meet both students and faculty where they are. That intersection of institutional change, human behavior, and AI systems is exactly where I operate best.
The Co-Creation Framework was developed through empirical research, classroom iteration, and faculty collaboration. View the full research deck below.
Beyond the Autograder
The quiz problem surfaced a deeper issue. Weekly module quizzes were multiple choice — and while students were permitted to form study groups, Professor Warren received consistent feedback that some students were sharing answers rather than genuinely engaging with the material. The format wasn't built for real learning. It was built for completion.
In a teaching team discussion about what to do differently, I did something unprompted: I designed and coded an interactive quiz experience — incorporating game-like mechanics, encouraging feedback loops, and psychological motivators that traditional assessments ignore entirely. Test-taking anxiety and low engagement are design problems, not student problems.
That conversation opened a larger question: what if we replaced multiple choice quizzes altogether?
The team had been considering small-group discussions as idea generators — we were inspired by how intimate classroom environments naturally produce more open dialogue, deeper engagement, and genuine peer learning. We began exploring whether AI could moderate that experience at scale. I attended vendor evaluation meetings with Breakout Learning — an AI-moderated discussion platform built for higher education — reviewing course material demos, contributing feedback on customization and pedagogical fit, and helping shape the decision to move forward.
The concept: rather than static cases with fixed answers, an AI moderator platform facilitates live small-group sessions — asking Socratic questions, following up dynamically, tracking participation equity, and generating session summaries for the teaching team. A static case becomes a dynamic one. Students learn to ask better questions — which is, not coincidentally, exactly what good consultants do.
The contract was signed and deployment was scheduled for the following term. I concluded my engagement in December 2025, just before rollout — but the groundwork, vendor relationship, and implementation framework were in place.
Additional Scope
Taught AI literacy (core concepts, prompting, tools education, workflow integration) to 120+ MBA students per term
Held open Zoom office hours for student questions on AI usage and collaboration
Conducted empirical research on AI governance and student engagement
Provided personalized human feedback alongside AI grading to address pushback on AI-only evaluation
Collaborated with Professor Warren on code review and joint system refinement
Owned all documentation for the autograding system and course AI integration
Includes a prompt library of business use cases and business logic/operations templates
Includes digital twin AnyLogic business case libraries
Outcomes
Implementing the autograder I created, alongside JHU IT's institutional release of HopGPT, resulted in:
Reduced grading turnaround time by 2 days across a 120+ student cohort over 2 terms, while simultaneously pushing to increase critical thinking, which led to the adoption of AI-moderated breakout discussions that prioritize engagement over rote multiple-choice tests
~30% increase in student learning engagement, plus AI literacy gains (not formally quantified) through platform usage, prompt scripts, and the AI literacy sessions provided
Improved evaluation consistency and feedback quality at scale, giving graduate students rich, live feedback beyond a static rubric
Co-Creation Framework adopted beyond original course following internal faculty presentation
Contributed to JHU IT's HopGPT API development as one of the platform's earliest production users, collaborating on bug fixes, tokenization, and feature development
Demonstrated that AI can evaluate complex reasoning in business case analysis, not just automate rote tasks
Closing
What this work proved is the same thing enterprise AI initiatives require: that deploying AI responsibly at scale demands equal investment in system architecture, governance design, and the human layer surrounding it. The environment was academic. The discipline was production-grade.
Prior to this role, I co-designed a multi-agent AI platform recognized at the 2025 INFORMS International Meeting in Singapore — cited in: Domenge, J., Pandey, R., Simmonds III, M., Warren, G. & Xu, M. (2025, July 20–23). From Code to Confidence: How Students Built a Multi-Agent AI to Navigate Job Interviews. INFORMS International Meeting, Singapore.