Bayarea0608 commented on PR #4391:
URL: https://github.com/apache/carbondata/pull/4391#issuecomment-4368264565
I just ran the demo via: .venv/bin/python examples/carbondata_quickstart.py
Got the results below; they look good.
============================================================
1. ingest_text — three short documents
============================================================
entities 4
chunks 11
embeddings 11
============================================================
2. semantic search — query: 'python web framework'
============================================================
doc:django (score=0.866) Django is a high-level Python web framework that encourages ...
doc:flask (score=0.866) Flask is a lightweight Python web framework with a minimal c...
doc:django (score=0.000) It ships with an ORM, an admin panel, and a templating syste...
============================================================
3. keyword search — query: 'transformer'
============================================================
doc:transformer (BM25=1.344) A transformer is a neural network architecture built around ...
doc:transformer (BM25=1.171) Modern embedding models and large language models use the tr...
============================================================
4. hybrid search — query: 'neural training'
============================================================
doc:transformer (RRF=0.0164) It largely replaced earlier recurrent neural network designs...
doc:transformer (RRF=0.0161) A transformer is a neural network architecture built around ...
doc:transformer (RRF=0.0159) Modern embedding models and large language models use the tr...
============================================================
5. ingest_table — users
============================================================
team=ml rows: 2
u3 {'id': 'u3', 'lang': 'Python', 'name': 'Carol', 'team': 'ml'}
u2 {'id': 'u2', 'lang': 'Python', 'name': 'Bob', 'team': 'ml'}
============================================================
6. memory — remember + recall
============================================================
score=0.500 sal=0.9 user prefers django over flask
score=0.354 sal=0.7 user works mainly with python web framework code
score=0.000 sal=0.4 user once asked about a neural transformer model
with min_salience=0.6 2 memories
============================================================
7. graph — relations + traversal
============================================================
neighbors of doc:django (out):
-> doc:flask compares_with (w=0.8)
-> doc:transformer tangentially_about (w=0.2)
traverse from doc:django, max_hops=2:
hop=1 doc:flask
hop=1 doc:transformer
hop=2 doc:recipe
subgraph from {doc:django, doc:flask}, max_hops=1:
entities ['doc:django', 'doc:flask', 'doc:transformer']
relations [('doc:django', 'doc:flask', 'compares_with'), ('doc:django', 'doc:transformer', 'tangentially_about'), ('doc:flask', 'doc:transformer', 'tangentially_about')]
============================================================
8. filter pushdown — kind=document only
============================================================
doc:django (document) Django is a high-level Python web framework that encourages ...
doc:flask (document) Flask is a lightweight Python web framework with a minimal c...
doc:django (document) It ships with an ORM, an admin panel, and a templating syste...
============================================================
9. admin
============================================================
validate.ok (before compact) False
! vector_index[vocab-demo] stale: meta.count=11 != live=18
compact size_before 4096
compact size_after 4096
validate.ok (after compact) True
export bytes 12848
export entities 11
export embeddings 18
============================================================
10. persistence — reopen the file
============================================================
post-reopen top-1 doc:flask
post-reopen recall user prefers django over flask
Done. Demo data lives at:
/var/folders/d3/x_28r1q932g6bq6pxcf8c6rh0000gp/T/carbondata_demo_9rx4uhc0/kb.carbondata
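
A note on the RRF values in section 4: 0.0164, 0.0161 and 0.0159 are what Reciprocal Rank Fusion would give for ranks 1, 2 and 3 with the commonly used constant k=60, when each chunk is scored from a single ranked list. That is an inference from the printed numbers only, not something confirmed from the PR's code. A minimal sketch under that assumption follows; `rrf_fuse` and the chunk ids are hypothetical names for illustration, not part of the PR.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids with Reciprocal Rank Fusion.

    Each id receives 1 / (k + rank) from every list it appears in,
    where rank is 1-based. k=60 is an assumed constant, not taken
    from the PR.
    """
    scores = {}
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] = scores.get(item_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical input: one retriever returns three chunks, the other none.
fused = rrf_fuse([["chunk_a", "chunk_b", "chunk_c"], []])
for item_id, score in fused:
    print(f"{item_id} RRF={score:.4f}")
# prints:
# chunk_a RRF=0.0164
# chunk_b RRF=0.0161
# chunk_c RRF=0.0159
```

A chunk that appeared in both ranked lists would get two contributions and score above 0.0164, so these values suggest each of the three hits came from only one of the two retrievers for this query.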