# Model Rankings

Vision models ranked by overall accuracy (exact + variant matches) across 89 tests per model.
| Rank | Model | Model Type | Accuracy | Avg Score | Cost per Run | Response Time |
|---|---|---|---|---|---|---|
#1 | ⚛ claude-opus-4.6-fast | Proprietary | 91.0% | 92 | $6.02 | - |
#2 | ✦︎ gemini-3.1-pro-preview-customtools | Proprietary | 90.0% | 89 | $1.94 | - |
#3 | bytedance-seed/seed-2.0-mini | Open | 88.8% | 90 | $0.03 | - |
#4 | ֎ gpt-5.1-chat | Proprietary | 88.3% | 81 | $0.12 | - |
#5 | ⚛ claude-opus-4.5 | Proprietary | 87.6% | 89 | $0.93 | - |
#6 | ✦︎ gemini-3.1-pro-preview | Proprietary | 87.6% | 87 | $1.76 | - |
#7 | moonshotai/kimi-k2.5 | Proprietary | 87.6% | 86 | $0.27 | - |
#8 | ⚛ claude-opus-4.7 | Proprietary | 87.5% | 90 | $1.16 | - |
#9 | ✦︎ gemini-3-flash-preview | Proprietary | 87.4% | 88 | $0.07 | - |
#10 | ֎ gpt-5-codex | Proprietary | 86.7% | 87 | $0.80 | - |
#11 | ⚛ claude-opus-4.6 | Proprietary | 86.5% | 86 | $0.89 | - |
#12 | z-ai/glm-5v-turbo | Proprietary | 86.5% | 83 | $0.36 | - |
#13 | ֎ o3 | Proprietary | 85.7% | 82 | $0.55 | - |
#14 | ⚛ claude-opus-4.1 | Proprietary | 85.4% | 85 | $2.60 | - |
#15 | ⚛ claude-sonnet-4.6 | Proprietary | 85.4% | 87 | $0.62 | - |
#16 | ✦︎ gemini-3.1-flash-image-preview | Proprietary | 85.4% | 86 | $0.05 | - |
#17 | bytedance-seed/seed-2.0-lite | Open | 85.2% | 89 | $0.11 | - |
#18 | ֎ gpt-5-pro | Proprietary | 84.8% | 76 | $17.99 | - |
#19 | ֎ gpt-4.1-mini | Proprietary | 84.4% | 84 | $0.07 | - |
#20 | moonshotai/kimi-k2.6 | Proprietary | 84.3% | 84 | $0.51 | - |
#21 | ֎ gpt-5 | Proprietary | 84.1% | 82 | $0.86 | - |
#22 | ֎ gpt-5.4-image-2 | Proprietary | 83.3% | 81 | $1.21 | - |
#23 | ֎ o4-mini-high | Proprietary | 83.0% | 80 | $0.62 | - |
#24 | ֎ gpt-5-image-mini | Proprietary | 82.0% | 79 | $0.45 | - |
#25 | ✦︎ gemini-3.1-flash-lite-preview | Proprietary | 81.6% | 85 | $0.03 | - |
#26 | ֎ gpt-5.3-chat | Proprietary | 81.1% | 81 | $0.36 | - |
#27 | ⚛ claude-3.5-haiku | Proprietary | 80.9% | 84 | $0.14 | - |
#28 | ⚛ claude-3.7-sonnet | Proprietary | 80.9% | 82 | $0.52 | - |
#29 | ֎ gpt-5-mini | Proprietary | 80.9% | 80 | $0.11 | - |
#30 | ֎ o3-pro | Proprietary | 80.5% | 77 | $5.70 | - |
#31 | ⚛ claude-opus-4 | Proprietary | 79.8% | 82 | $2.61 | - |
#32 | ✦︎ gemini-3-pro-image-preview | Proprietary | 79.8% | 82 | $1.03 | - |
#33 | ֎ gpt-5.4-mini | Proprietary | 79.8% | 82 | $0.11 | - |
#34 | ֎ gpt-5.5 | Proprietary | 79.6% | 80 | $1.39 | - |
#35 | ✦︎ gemini-2.5-pro | Proprietary | 79.5% | 83 | $0.90 | - |
#36 | ֎ gpt-5.1 | Proprietary | 79.0% | 78 | $0.44 | - |
#37 | ֎ gpt-5.3-codex | Proprietary | 78.9% | 82 | $0.68 | - |
#38 | baidu/ernie-4.5-vl-424b-a47b | Proprietary | 78.8% | 80 | $0.05 | - |
#39 | ⚛ claude-sonnet-4.5 | Proprietary | 78.7% | 80 | $0.54 | - |
#40 | ֎ gpt-5.1-codex-max | Proprietary | 78.7% | 82 | $1.36 | - |
#41 | ֎ gpt-5.5-pro | Proprietary | 78.7% | 80 | $39.95 | - |
#42 | ֎ o1 | Proprietary | 78.7% | 81 | $5.80 | - |
#43 | x-ai/grok-4.20-multi-agent | Proprietary | 78.7% | 80 | $3.48 | - |
#44 | ֎ o4-mini | Proprietary | 78.4% | 81 | $0.36 | - |
#45 | ✦︎ gemini-2.5-flash | Proprietary | 78.3% | 81 | $0.05 | - |
#46 | ✦︎ gemini-3.1-flash-lite | Proprietary | 78.0% | 82 | $0.03 | - |
#47 | ✦︎ gemini-2.5-flash-image | Proprietary | 77.9% | 81 | $0.03 | - |
#48 | ✦︎ gemini-2.0-flash-lite-001 | Proprietary | 77.8% | 78 | $0.01 | - |
#49 | ֎ gpt-5-image | Proprietary | 77.6% | 77 | $1.61 | - |
#50 | ֎ o3-deep-research | Proprietary | 77.6% | 79 | $39.23 | - |
#51 | ֎ gpt-4o-2024-08-06 | Proprietary | 77.3% | 79 | $0.20 | - |
#52 | ⚛ claude-sonnet-4 | Proprietary | 77.0% | 80 | $0.53 | - |
#53 | ✦︎ gemini-2.5-pro-preview-05-06 | Proprietary | 76.9% | 82 | $0.94 | - |
#54 | ֎ o4-mini-deep-research | Proprietary | 76.7% | 80 | $27.01 | - |
#55 | ֎ gpt-5.1-codex | Proprietary | 76.6% | 82 | $0.43 | - |
#56 | x-ai/grok-4 | Proprietary | 76.6% | 79 | $1.92 | - |
#57 | z-ai/glm-4.6v | Proprietary | 76.5% | 78 | $0.07 | - |
#58 | bytedance-seed/seed-1.6-flash | Open | 76.4% | 76 | $0.03 | - |
#59 | z-ai/glm-4.5v | Proprietary | 76.4% | 77 | $0.09 | - |
#60 | ֎ gpt-5.1-codex-mini | Proprietary | 76.3% | 78 | $0.07 | - |
#61 | ✦︎ gemini-2.5-pro-preview | Proprietary | 76.1% | 81 | $0.94 | - |
#62 | ֎ gpt-5.4 | Proprietary | 75.9% | 80 | $0.35 | - |
#63 | ⚛ claude-3.7-sonnet:thinking | Proprietary | 75.3% | 82 | $0.86 | - |
#64 | ✦︎ gemini-2.5-flash-lite-preview-09-2025 | Proprietary | 75.3% | 79 | $0.02 | - |
#65 | ֎ gpt-5.4-pro | Proprietary | 75.3% | 80 | $31.36 | - |
#66 | ֎ gpt-chat-latest | Proprietary | 75.3% | 81 | $0.76 | - |
#67 | openrouter/auto | Proprietary | 74.4% | 78 | $0.02 | - |
#68 | ֎ gpt-4o-2024-11-20 | Proprietary | 74.4% | 81 | $0.22 | - |
#69 | ֎ gpt-4o-mini-2024-07-18 | Proprietary | 73.7% | 74 | $0.35 | - |
#70 | ֎ gpt-5.2 | Proprietary | 73.5% | 78 | $0.62 | - |
#71 | ֎ gpt-4o | Proprietary | 73.0% | 77 | $0.20 | - |
#72 | ֎ gpt-5-chat | Proprietary | 73.0% | 79 | $0.14 | - |
#73 | ✦︎ gemini-2.0-flash-001 | Proprietary | 72.9% | 75 | $0.02 | - |
#74 | qwen/qwen3-vl-30b-a3b-thinking | Open | 72.9% | 76 | $0.05 | - |
#75 | ֎ gpt-4o-2024-05-13 | Proprietary | 72.7% | 78 | $0.40 | - |
#76 | ֎ gpt-4.1 | Proprietary | 72.2% | 82 | $0.19 | - |
#77 | qwen/qwen3-vl-235b-a22b-thinking | Open | 71.6% | 78 | $0.17 | - |
#78 | ֎ gpt-5.2-chat | Proprietary | 71.6% | 78 | $0.45 | - |
#79 | ֎ gpt-5.2-pro | Proprietary | 71.3% | 77 | $7.27 | - |
#80 | ֎ gpt-4o-mini | Proprietary | 71.1% | 77 | $0.35 | - |
#81 | ✦︎ gemma-4-31b-it | Proprietary | 71.1% | 76 | $0.01 | - |
#82 | qwen/qwen2.5-vl-72b-instruct | Open | 69.7% | 78 | $0.05 | - |
#83 | qwen/qwen3-vl-235b-a22b-instruct | Open | 69.7% | 79 | $0.06 | - |
#84 | bytedance-seed/seed-1.6 | Open | 67.8% | 76 | $0.10 | - |
#85 | ⚛ claude-3-haiku | Proprietary | 67.4% | 74 | $0.04 | - |
#86 | x-ai/grok-4.20 | Proprietary | 67.4% | 73 | $0.15 | - |
#87 | ✦︎ gemini-2.5-flash-lite | Proprietary | 67.1% | 73 | $0.02 | - |
#88 | amazon/nova-premier-v1 | Proprietary | 66.3% | 74 | $0.80 | - |
#89 | qwen/qwen-vl-plus | Open | 66.3% | 75 | $0.04 | - |
#90 | ֎ gpt-4-turbo | Proprietary | 65.9% | 71 | $0.81 | - |
#91 | qwen/qwen3.6-plus | Open | 65.5% | 76 | $0.74 | - |
#92 | qwen/qwen3.5-plus-02-15 | Open | 65.2% | 77 | $0.52 | - |
#93 | ֎ gpt-5.2-codex | Proprietary | 64.5% | 70 | $0.54 | - |
#94 | ✦︎ gemma-3-27b-it | Proprietary | 64.0% | 73 | $0.01 | - |
#95 | 🦙 llama-4-maverick | Open | 64.0% | 72 | $0.06 | - |
#96 | mistralai/mistral-medium-3 | Open | 64.0% | 75 | $0.07 | - |
#97 | mistralai/pixtral-large-2411 | Open | 62.9% | 71 | $0.77 | - |
#98 | baidu/ernie-4.5-vl-28b-a3b | Proprietary | 61.8% | 70 | $0.02 | - |
#99 | qwen/qwen3-vl-30b-a3b-instruct | Open | 61.8% | 74 | $0.03 | - |
#100 | qwen/qwen3.5-35b-a3b | Open | 61.2% | 74 | $0.97 | - |
#101 | qwen/qwen3.5-397b-a17b | Open | 61.2% | 74 | $1.36 | - |
#102 | qwen/qwen3.5-flash-02-23 | Open | 60.7% | 71 | $0.13 | - |
#103 | qwen/qwen3.5-122b-a10b | Open | 60.3% | 71 | $2.54 | - |
#104 | ֎ gpt-5-nano | Proprietary | 60.2% | 66 | $0.05 | - |
#105 | x-ai/grok-4-fast | Proprietary | 60.2% | 68 | $0.04 | - |
#106 | qwen/qwen3.6-flash | Open | 60.2% | 71 | $0.61 | - |
#107 | ✦︎ gemma-4-26b-a4b-it | Proprietary | 59.8% | 72 | $0.01 | - |
#108 | ⚛ claude-haiku-4.5 | Proprietary | 59.6% | 71 | $0.18 | - |
#109 | perplexity/sonar-reasoning-pro | Proprietary | 59.6% | 74 | $1.13 | - |
#110 | xiaomi/mimo-v2.5 | Proprietary | 57.3% | 73 | $0.16 | - |
#111 | ֎ gpt-4.1-nano | Proprietary | 55.7% | 65 | $0.03 | - |
#112 | amazon/nova-pro-v1 | Proprietary | 55.1% | 67 | $0.22 | - |
#113 | perplexity/sonar-pro-search | Proprietary | 55.1% | 71 | $1.85 | - |
#114 | x-ai/grok-4.1-fast | Proprietary | 55.1% | 67 | $0.06 | - |
#115 | qwen/qwen3.6-35b-a3b | Open | 53.4% | 70 | $0.54 | - |
#116 | xiaomi/mimo-v2-omni | Proprietary | 52.8% | 64 | $0.18 | - |
#117 | qwen/qwen3.5-27b | Open | 52.5% | 68 | $2.40 | - |
#118 | ✦︎ gemma-3-12b-it | Proprietary | 52.4% | 64 | $0.01 | - |
#119 | 🦙 llama-3.2-11b-vision-instruct | Open | 52.2% | 65 | $0.04 | - |
#120 | mistralai/mistral-medium-3-5 | Open | 51.7% | 67 | $0.28 | - |
#121 | ֎ gpt-5.4-nano | Proprietary | 51.7% | 62 | $0.03 | - |
#122 | qwen/qwen3-vl-8b-instruct | Open | 51.7% | 66 | $0.03 | - |
#123 | amazon/nova-lite-v1 | Proprietary | 50.6% | 66 | $0.02 | - |
#124 | qwen/qwen3-vl-32b-instruct | Open | 50.6% | 64 | $0.08 | - |
#125 | qwen/qwen3.6-27b | Open | 50.6% | 64 | $1.03 | - |
#126 | perplexity/sonar | Proprietary | 49.4% | 69 | $0.60 | - |
#127 | x-ai/grok-4.3 | Proprietary | 49.3% | 52 | $0.49 | - |
#128 | 🦙 llama-4-scout | Open | 48.3% | 62 | $0.03 | - |
#129 | perplexity/sonar-pro | Proprietary | 48.3% | 67 | $1.18 | - |
#130 | qwen/qwen3-vl-8b-thinking | Open | 48.3% | 61 | $0.11 | - |
#131 | mistralai/mistral-medium-3.1 | Open | 47.7% | 65 | $0.07 | - |
#132 | ✦︎ gemma-3-4b-it | Proprietary | 44.9% | 60 | $0.01 | - |
#133 | qwen/qwen3.5-9b | Open | 42.3% | 64 | $0.11 | - |
#134 | mistralai/mistral-small-3.2-24b-instruct | Open | 41.6% | 56 | $0.02 | - |
#135 | mistralai/mistral-large-2512 | Open | 40.4% | 54 | $0.08 | - |
#136 | bytedance/ui-tars-1.5-7b | Open | 38.2% | 55 | $0.02 | - |
#137 | mistralai/ministral-14b-2512 | Open | 36.0% | 51 | $0.03 | - |
#138 | mistralai/ministral-8b-2512 | Open | 32.6% | 48 | $0.03 | - |
#139 | mistralai/mistral-small-3.1-24b-instruct | Open | 32.6% | 54 | $0.05 | - |
#140 | amazon/nova-2-lite-v1 | Proprietary | 31.5% | 52 | $0.07 | - |
#141 | baidu/qianfan-ocr-fast:free | Free | 30.1% | 47 | $0.00 | - |
#142 | nvidia/nemotron-nano-12b-v2-vl:free | Free | 29.4% | 53 | $0.00 | - |
#143 | mistralai/ministral-3b-2512 | Open | 29.2% | 47 | $0.02 | - |
#144 | rekaai/reka-edge | Proprietary | 10.1% | 37 | $0.01 | 3.7s |
#145 | mistralai/mistral-small-2603 | Open | 9.0% | 19 | $0.03 | - |
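For reference, the accuracy column can be reproduced as a simple ratio: correct answers (exact matches plus accepted variant matches) over the 89 tests each model runs. The sketch below is illustrative; the function name and the example exact/variant split are assumptions, and only the 89-test denominator comes from the table's description.

```python
def overall_accuracy(exact: int, variant: int, total_tests: int = 89) -> float:
    """Fraction of tests counted as correct: both exact matches and
    accepted close-variant matches contribute to the score."""
    return (exact + variant) / total_tests

# Hypothetical split: 74 exact + 7 variant matches out of 89 tests.
# (74 + 7) / 89 = 0.9101..., which rounds to the 91.0% shown for the
# top-ranked model; the split itself is not reported in the table.
print(f"{overall_accuracy(74, 7):.1%}")  # → 91.0%
```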