Intelliglyph

Prompt Engineering Reimagined for Better User Experience

This project is based on OpenAI's Democratic Inputs to AI grant program. The policy statement we worked on explores how AI models should balance the possibilities of generating homogeneous or diverse outputs in order to make AI more inclusive.

Team:

Dr. Setor Zilevu (Instructor)

Helen Fang

Yanrong Feng

My Role:

UX Research

UX Designer

Year:

2020

Tools:

Figma

Midjourney

Unity

Meta Quest 3

The Problem

The initial prompt question provided by OpenAI

When generative models create images for underspecified prompts like “a CEO”, “a doctor”, or “a nurse”, they have the potential to produce either diverse or homogeneous outputs. How should AI models balance these possibilities? What factors should be prioritized when deciding the depiction of people in such cases?

Literature Review

KEY INSIGHT 1

As the image on the right demonstrates, most of the generated CEOs and lawyers appear to be men, whereas teachers and housekeepers are depicted as women.

KEY INSIGHT 2

AI-generated outputs are heavily based on the stereotypes fed to the models: when images are generated, high-paying jobs are depicted with lighter-skinned people and low-paying jobs with darker-skinned people.

KEY INSIGHT 3

Text-to-image models such as Midjourney reinforce gender and race stereotypes, so people using these systems might unconsciously adopt these biases.

Project Objective

Based on the insights provided in the literature review

Given this context, the objective of this project is to explore ways to reduce bias in text-to-image AI models, mitigate representational harms, and promote the representation of diverse cultures and visual identities. A further aim is to empower users with greater control over AI-generated images.

Research Question

How can the integration of content mindfulness and prompt refinement in text-to-image AI systems empower digital content creators to produce inclusive and representative visual content, while mitigating the perpetuation of biases and stereotypes?

Ideation

User Personas

Project Hypothesis

Hypothesis 1: A prompt refinement and guardrail approach can help solve the problem of underspecified prompts.

Hypothesis 2: Designing the output UI to showcase diversity by creating a place for user customization based on location, preferences, and other criteria.

User Research

User Interviews

6 Users

Discussion Guide

Surveys

23 Participants

100% - noticed biases and stereotypes

70% - 'often' or 'always' tweak their prompts

43% - spend 5 to 30 minutes fine-tuning prompts

13% - spend 30 minutes to an hour fine-tuning prompts

Survey Questions

Insights

High appreciation for speed & efficiency

Most users value the system’s quick and efficient generation of results

Prompt adjustment is a key step

The fact that all users engage in prompt adjustments highlights its importance in the user journey.

Need for better prompt guidance

Many participants seek clearer instructions for creating effective prompts.

Challenges in getting the right image

Users are commonly frustrated when trying to achieve the desired image.

Low-Fid Prototype

3 versions of prototypes based on analysis

Suggested Categories
Version 1 a)
Suggested Categories
Version 1 b)
Detailed Tags & Generative Suggestions
Version 2
Full Sentence Suggestions
Version 3

Usability Testing

Protocol

Task 1: Choose a Prompt

Participants are asked to choose one of the three given underspecified prompts.

Task 2: Generate a result in Midjourney

Participants generate images in Midjourney with the chosen prompt and are encouraged to comment on the results.

Task 3: Prompt Refinement with our Prototypes

Participants use our prototypes to refine their prompts and rate the experience on a scale of 1 to 5.

Task 4: Regeneration in Midjourney and Feedback

Participants regenerate images in Midjourney with the refined prompt and share their thoughts on the suggested prompts and prototypes.

Individual Insights

Prototype 1A) and 1B)

INSIGHT 1

Users preferred V2 over V1, as it aligns with their previous experiences.

INSIGHT 2

Dragging and dropping the text blocks changes the sentence structure, which may alter the intent of the prompt.

INSIGHT 3

Too many colors could make it seem overwhelming, cartoonish, or not to be taken seriously. Words such as “of” could be hard to distinguish.

Prototype 2

INSIGHT 1

The users might be overwhelmed with so many options in the detailed dropdowns. The sentence structures might be restricted based on this.

INSIGHT 2

AI interfaces are usually monochrome, so the colors can seem fake and might cause users not to take the output seriously.

INSIGHT 3

Users don’t want to spend too much time thinking, typing or choosing the prompt.

Prototype 3

INSIGHT 1

Users liked the suggested prompt options and the template format, finding them inspiring and efficient.

INSIGHT 2

Users questioned the prompt generation limits and the reasoning behind suggestions.

INSIGHT 3

Users requested an edit feature.

Overall Insights

Time savings

"Using templates makes this so much easier and faster. I appreciate not having to think too much about the setup and just dive into customizing my prompts"

Ability to edit prompts

"The ability to edit and tweak the selected prompts before generating images is smart."

Rationale behind suggested prompts

"I like the tags like 'Age', 'Gender', and 'Culture' in the prompt suggestions. It is thoughtful and enhance the rationale behind each prompt."

The Ultimate Goal

To help increase user control when writing prompts for AI-generated images.

High-Fi Prototype

Content Mindfulness

As users start typing, relevant categories will be highlighted and checked. This indicates that the content they input aligns with these content mindfulness parameters.
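The highlighting behavior described above can be sketched as a simple keyword match. This is a minimal illustration, not the prototype's actual logic: the category names are borrowed from the tags mentioned in user feedback ('Age', 'Gender', 'Culture'), while the keyword lists and the function name `highlighted_categories` are assumptions made for this example.

```python
import re

# Hypothetical keyword lists, for illustration only. A real system would
# need far broader vocabularies or a classifier.
CATEGORY_KEYWORDS = {
    "Age": {"young", "elderly", "middle-aged", "teenage", "child"},
    "Gender": {"woman", "man", "female", "male", "non-binary"},
    "Culture": {"nigerian", "japanese", "mexican", "indian", "brazilian"},
}


def highlighted_categories(prompt: str) -> list[str]:
    """Return the categories whose keywords appear as whole words in the prompt."""
    # Tokenize into lowercase words, keeping hyphenated words together,
    # so that "woman" does not accidentally match the keyword "man".
    words = set(re.findall(r"[a-z]+(?:-[a-z]+)*", prompt.lower()))
    return [
        category
        for category, keywords in CATEGORY_KEYWORDS.items()
        if words & keywords
    ]


# An underspecified prompt highlights nothing; a refined one highlights all three.
print(highlighted_categories("a CEO"))
print(highlighted_categories("an elderly Japanese woman working as a CEO"))
```

Running the check on every keystroke would let the interface tick categories off as the user types, which matches the interaction described in this section.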

Prompt Refinement

If the user isn’t happy with the prompt or the image, a “Regenerate” button produces more results, either from their current prompt or from a new suggested option.

History

History allows users to see previously generated images. Clicking on one of the images opens the metadata of its prompt.
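One way to model this history is a small store keyed by image. The sketch below is hypothetical: the field names (`image_id`, `prompt`, `tags`, `created_at`) and the `History` class are assumptions for illustration and are not taken from the prototype.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class HistoryEntry:
    """Metadata stored alongside one generated image (hypothetical fields)."""
    image_id: str
    prompt: str
    tags: tuple[str, ...] = ()
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class History:
    """Keeps generated images in the order they were created."""

    def __init__(self) -> None:
        self._entries: dict[str, HistoryEntry] = {}

    def add(self, entry: HistoryEntry) -> None:
        self._entries[entry.image_id] = entry

    def previous_images(self) -> list[str]:
        # Dicts preserve insertion order, so this is oldest first.
        return list(self._entries)

    def metadata_for(self, image_id: str) -> HistoryEntry:
        # Called when the user clicks an image in the history view.
        return self._entries[image_id]


history = History()
history.add(HistoryEntry("img-1", "a CEO"))
history.add(HistoryEntry("img-2", "an elderly woman CEO", tags=("Age", "Gender")))
print(history.metadata_for("img-2").prompt)
```

Storing the tags with each entry would also let the history view show which content mindfulness categories a past prompt covered.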

Final Demo

Impact

Content Creation & Mindfulness

There are currently few solutions for content mindfulness; this tool lets users see that they can create inclusive content.

Diverse Content

Inclusive and efficient prompts in real time through the use of our tags and filters

Underspecified Prompts

Potential for users to understand how to phrase and write prompts

Raising Awareness

Greater understanding of the nuanced interactions between textual input and image output

Responsible/ Thoughtful Use

Provides a practical tool for users and contributes to the broader discourse on text-to-image AI systems.

Success Metrics

Less Biased Results

Decrease in biased images compared to original user inputs.

Click Rate

Users want to use the suggested prompt structures to generate their images

Time-Saving

Users successfully save time in prompt generation

Satisfaction Rates

How satisfied users are with the generated results.

Effectiveness

Users can effectively use the guidance of the “Content Mindfulness” section by having all the categories eventually highlighted.
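As a rough illustration of how the effectiveness metric could be computed, the sketch below measures the share of sessions in which every category ended up highlighted. The function name, the session format, and the sample data are all hypothetical; no such measurements are reported in this case study.

```python
def effectiveness_rate(sessions, categories):
    """Share of sessions in which every category was eventually highlighted.

    `sessions` is a list of sets, each holding the categories highlighted
    by the end of one prompt-writing session (an assumed logging format).
    """
    if not sessions:
        return 0.0
    required = set(categories)
    complete = sum(1 for highlighted in sessions if required <= set(highlighted))
    return complete / len(sessions)


# Made-up example data: two of four sessions cover all three categories.
sample_sessions = [
    {"Age", "Gender", "Culture"},
    {"Age"},
    {"Age", "Gender", "Culture"},
    {"Gender", "Culture"},
]
print(effectiveness_rate(sample_sessions, ["Age", "Gender", "Culture"]))
```

A rate close to 1.0 would suggest that users follow the guidance through to the end, while a low rate could point to categories that are hard to satisfy.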

Conclusions & Learnings

Critical Thinking on Human-In-The-Loop

Currently, there are few content mindfulness solutions for text-to-image AI on the market. Our goal is therefore to help users craft inclusive and efficient prompts on the fly.

User-Centric Focus

To fully engage with the problem, we spent considerably longer narrowing down our research question and forming hypotheses. The prototype reflects our commitment to creating a user-centered and ethical text-to-image AI system.