Manage all of your contacts at scale

Short description goes here so it complements the heading.

Video

Video BG

Wide Image

Booking.com team photo

Table 1

Label | Details
Example A | Short description for A.
Example B | Longer description that wraps; no horizontal scroll.
Example C | Another line with more content to test wrapping.

Table 2

Booking.com Flagship AI Solutions

Solution | Challenge it solves | How it works
Smart Filters | Traditional search relied on drop-down menus and checkboxes, limiting travelers to a small number of filters. | Uses GPT-4o mini to understand prompts like “sunset views” or “great gym.” Goes beyond predefined filters by analyzing reviews, images, and listing details. Surfaces more relevant results, driving engagement and conversions.
Property Q&A | Many travelers have specific questions about properties that aren’t easily answered in a static listing. | OpenAI’s LLMs were fine-tuned on user content and property descriptions. Handles queries like “Is there a crib available?” or “Is the pool open in winter?” Adapts to ambiguity in pet-policy definitions.
AI Review Summaries | Travelers often struggle to sift through thousands of reviews when comparing properties. | GPT-4o mini analyzes and summarizes reviews into themes (cleanliness, location, amenities). Generates concise summaries, speeding decisions and boosting confidence. (See the sketch below this table.)
Help Me Reply | Hosts need to manage guest communications efficiently and cut response times. | Auto-generates responses and templates via OpenAI’s models. Hosts track replies with a reply-score metric.
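
The AI Review Summaries row above maps naturally onto a single chat-completion call. Below is a minimal sketch, assuming the standard OpenAI Python SDK and an OPENAI_API_KEY in the environment; the theme list, prompt wording, and the summarize_reviews helper are illustrative stand-ins, not Booking.com’s actual implementation.

```python
# Minimal review-summary sketch (illustrative; not Booking.com's production code).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Assumed theme list, mirroring the themes named in the table above.
THEMES = ["cleanliness", "location", "amenities"]

def summarize_reviews(reviews: list[str]) -> str:
    """Condense raw guest reviews into a short, theme-grouped summary."""
    prompt = (
        "Summarize the following guest reviews under these themes: "
        f"{', '.join(THEMES)}. Keep each theme to one sentence.\n\n"
        + "\n---\n".join(reviews)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You summarize hotel guest reviews."},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

print(summarize_reviews([
    "Spotless room and a great rooftop pool.",
    "Five minutes from the station, but the gym was tiny.",
]))
```

The same request shape, with a different system prompt, would cover Help Me Reply’s drafted responses; only the prompt and the surrounding product logic change.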

Table 3

Model Evaluation Scores

Benchmark | GPT-4.5 | GPT-4o
GPQA (science) | 71.4% | 53.6%
AIME ‘24 (math) | 36.7% | 9.3%
MMMLU (multilingual) | 85.1% | 81.5%
MMMU (multimodal) | 74.4% | 69.1%
SWE-Lancer Diamond (coding)* | 32.6% ($186,125) | 23.3% ($138,750)
SWE-Bench Verified (coding)* | 38.0% | 30.7%

* Numbers shown represent best internal performance. Dollar figures are the corresponding task payouts earned on the SWE-Lancer benchmark.

Table 4

Model Evaluation: GPT-4 Comparison

Benchmark | GPT-4.5 | GPT-4o | Baseline
GPQA (science) | 71.4% | 53.6% | 50.0%
AIME ‘24 (math) | 36.7% | 9.3% | 25.0%
MMMLU (multilingual) | 85.1% | 81.5% | 70.0%
MMMU (multimodal) | 74.4% | 69.1% | 60.0%
SWE-Lancer Diamond (coding)* | 32.6% | 23.3% | 10.0%
SWE-Bench Verified (coding)* | 38.0% | 30.7% | 20.0%

* Numbers shown represent best internal performance.

Block Quotes:

This doesn't look anything like what I want

Frequently asked questions

Short question goes here

Short heading goes here