[ 
https://issues.apache.org/jira/browse/IMPALA-12467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sreenath updated IMPALA-12467:
------------------------------
    Description: 
Semantic search is a way for computers to understand the meaning behind words 
and phrases when you're searching for something. Instead of just looking for 
exact matches of keywords, it tries to figure out what you're really asking and 
provides results that are more relevant and meaningful to your question. It's 
like having a search engine that can understand what you mean, not just what 
you say, making it easier to find the information you're looking for. This 
ticket is a wish to have semantic search in Impala.

On the implementation side, semantic search uses an embedding model and any of 
the similarity distance functions.
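
As a rough illustration of what "similarity distance functions" could mean once values have been turned into embedding vectors, the Python sketch below computes cosine similarity, cosine distance, and Euclidean distance. The function names are illustrative assumptions only; they are not existing Impala functions and are not part of this ticket.

```python
import math
from typing import Sequence


def _check_same_length(a: Sequence[float], b: Sequence[float]) -> None:
    # Embeddings can only be compared when they have the same dimension.
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b (1.0 means same direction)."""
    _check_same_length(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity (0.0 means same direction)."""
    return 1.0 - cosine_similarity(a, b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between a and b (0.0 means identical)."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Note that a similarity grows as two values get closer in meaning, while a distance shrinks, so the comparison operator in a filter depends on which kind of function is used.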

My proposal is to implement functions that calculate the similarity distance 
between two values on the fly. Once we have them, we could easily do semantic 
search as part of a WHERE clause.
 * E.g. (using a cosine similarity function): 
"WHERE cos_dist(region, 'europe') > 0.9". This could return records with 
regions like Scandinavia, Nordic, Baltic, etc.
 * We could have functions that accept values as text or as vector embeddings.
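
The proposed filter can be sketched in Python as follows. Everything here is an assumption made for illustration: `TOY_EMBEDDINGS` is a hand-made lookup table standing in for a real embedding model, `where_similar` is a hypothetical helper that mimics the WHERE clause, and `cos_dist` only borrows its name from the example above (it returns a cosine similarity, so higher means closer). None of these exist in Impala.

```python
import math
from typing import Dict, List, Sequence, Union

Vector = Sequence[float]

# Hypothetical, hand-made embeddings. A real system would call an embedding model.
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "europe": [0.9, 0.1, 0.0],
    "scandinavia": [0.8, 0.3, 0.0],
    "nordic": [0.85, 0.25, 0.0],
    "baltic": [0.8, 0.2, 0.1],
    "asia": [0.1, 0.9, 0.2],
}


def to_vector(value: Union[str, Vector]) -> Vector:
    """Accept either text (looked up in the toy table) or a ready-made vector."""
    if isinstance(value, str):
        return TOY_EMBEDDINGS[value.lower()]  # raises KeyError for unknown text
    return value


def cos_dist(left: Union[str, Vector], right: Union[str, Vector]) -> float:
    """Cosine similarity of the two arguments, as in the ticket's example."""
    a, b = to_vector(left), to_vector(right)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def where_similar(rows: List[dict], column: str,
                  query: Union[str, Vector], threshold: float) -> List[dict]:
    """Mimic: SELECT * FROM rows WHERE cos_dist(column, query) > threshold."""
    return [row for row in rows if cos_dist(row[column], query) > threshold]


ROWS = [
    {"id": 1, "region": "Scandinavia"},
    {"id": 2, "region": "Nordic"},
    {"id": 3, "region": "Baltic"},
    {"id": 4, "region": "Asia"},
]

matches = where_similar(ROWS, "region", "europe", 0.9)
```

With these toy vectors, `matches` holds the Scandinavia, Nordic, and Baltic rows and leaves out Asia. Passing `TOY_EMBEDDINGS["europe"]` instead of the string `"europe"` gives the same result, which is the "text or vector embeddings" choice from the second bullet.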

  was:
Semantic search is a way for computers to understand the meaning behind words 
and phrases when you're searching for something. Instead of just looking for 
exact matches of keywords, it tries to figure out what you're really asking and 
provides results that are more relevant and meaningful to your question. It's 
like having a search engine that can understand what you mean, not just what 
you say, making it easier to find the information you're looking for. This 
ticket is a wish to have Semantic search in ImpalaSemantic Search In Hive.

On the implementation side, semantic search uses an embedding model and any of 
the similarity distance functions. 

My proposal is to implement functions that calculate the similarity distance 
between two values on the fly. Once we have them, we could easily do semantic 
search as part of a WHERE clause.
 * E.g. (using a cosine similarity function): 
"WHERE cos_dist(region, 'europe') > 0.9". This could return records with 
regions like Scandinavia, Nordic, Baltic, etc.
 * We could have functions that accept values as text or as vector embeddings.


> Semantic Search In Impala
> -------------------------
>
>                 Key: IMPALA-12467
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12467
>             Project: IMPALA
>          Issue Type: Wish
>            Reporter: Sreenath
>            Priority: Major
>
> Semantic search is a way for computers to understand the meaning behind words 
> and phrases when you're searching for something. Instead of just looking for 
> exact matches of keywords, it tries to figure out what you're really asking 
> and provides results that are more relevant and meaningful to your question. 
> It's like having a search engine that can understand what you mean, not just 
> what you say, making it easier to find the information you're looking for. 
> This ticket is a wish to have semantic search in Impala.
>
> On the implementation side, semantic search uses an embedding model and any 
> of the similarity distance functions.
>
> My proposal is to implement functions that calculate the similarity distance 
> between two values on the fly. Once we have them, we could easily do semantic 
> search as part of a WHERE clause.
>  * E.g. (using a cosine similarity function): 
> "WHERE cos_dist(region, 'europe') > 0.9". This could return records with 
> regions like Scandinavia, Nordic, Baltic, etc.
>  * We could have functions that accept values as text or as vector 
> embeddings.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

