Hello,

you can access buckets you have already created using the ['name'] syntax; in
your case you can do (instead of the chaining):

s.aggs['xColor']['xMake']['xCity'].metric(...)
s.aggs['xColor']['xMake']['xCity'].metric(...)

This way you can add aggregations to already created buckets.

Alternatively, you can keep a pointer to the innermost bucket (start with
s.aggs) and go from there; in your case that is a bunch of nested buckets and
then the metrics inside:

b = s.aggs
for bucket in xVarBuckets:
    # note: call bucket() on b, not on s.aggs, so that each new
    # bucket is nested inside the previous one
    b = b.bucket(bucket['label'], 'terms', field=bucket['field'])

for metric in xVar_Metrics:
    b.metric(metric['label'], metric['agg_function'], field=metric['field'])
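
If you want to sanity-check the shape without touching ES, the same
"keep a pointer to the innermost level" walk can be sketched with plain
dicts. This only illustrates the nesting that gets generated; it is not
elasticsearch-dsl internals:

```python
import json

# bucket/metric specs copied from the example in your mail
xVarBuckets = [{'field': 'color', 'label': 'xColor'},
               {'field': 'make',  'label': 'xMake'},
               {'field': 'city',  'label': 'xCity'}]
xVar_Metrics = [{'field': 'price', 'agg_function': 'sum', 'label': 'xMyPriceSum'},
                {'field': 'price', 'agg_function': 'avg', 'label': 'xMyPriceAvg'}]

aggs = {}
level = aggs  # pointer to the innermost 'aggs' dict so far
for bucket in xVarBuckets:
    level[bucket['label']] = {'terms': {'field': bucket['field']}, 'aggs': {}}
    level = level[bucket['label']]['aggs']  # descend into the bucket just added

for metric in xVar_Metrics:
    level[metric['label']] = {metric['agg_function']: {'field': metric['field']}}

print(json.dumps(aggs, indent=2))
```

Each bucket ends up nested inside the previous one, with both metrics in the
innermost bucket, which is exactly what the loop does with real aggregation
objects.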


Hope this helps,

On Mon, Mar 30, 2015 at 10:55 PM, Mike <almug...@googlemail.com> wrote:

> The python elasticsearch and elasticsearch-dsl packages are life-savers and
> got me converted to ES.
>
> Now I am trying to use the elasticsearch-dsl package to create pivot tables
> in ES, but I am having a hard time figuring out how to chain the buckets
> programmatically.
> While chaining the buckets/metrics manually works, chaining them
> programmatically seems impossible.
>
> here is an example
>
>
> from elasticsearch import Elasticsearch
> from elasticsearch_dsl import Search as dsl_search, A, Q, F
> # create client
> es = Elasticsearch('localhost:9200')
> # data: from the definitive guide, slightly modified
> xData = [
> {'doc_id' : 1, 'price' : 10000, 'color' : 'red',   'make' : 'honda',
> 'sold' : '2014-10-28', 'city': 'ROME',   'insurance': 'y'},
> {'doc_id' : 2, 'price' : 20000, 'color' : 'red',   'make' : 'honda',
> 'sold' : '2014-11-05', 'city': 'ROME',   'insurance': 'n'},
> {'doc_id' : 3, 'price' : 30000, 'color' : 'green', 'make' : 'ford',
> 'sold' : '2014-05-18', 'city': 'Berlin', 'insurance': 'y'},
> {'doc_id' : 4, 'price' : 15000, 'color' : 'blue',  'make' : 'toyota',
> 'sold' : '2014-07-02', 'city': 'Berlin', 'insurance': 'n'},
> {'doc_id' : 5, 'price' : 12000, 'color' : 'green', 'make' : 'toyota',
> 'sold' : '2014-08-19', 'city': 'Berlin', 'insurance': 'n'},
> {'doc_id' : 6, 'price' : 20000, 'color' : 'red',   'make' : 'honda',
> 'sold' : '2014-11-05', 'city': 'Paris',  'insurance': 'n'},
> {'doc_id' : 7, 'price' : 80000, 'color' : 'red',   'make' : 'bmw',
> 'sold' : '2014-01-01', 'city': 'Paris',  'insurance': 'y'},
> {'doc_id' : 8, 'price' : 25000, 'color' : 'blue',  'make' : 'ford',
> 'sold' : '2014-02-12', 'city': 'Paris',  'insurance': 'y'}]
>
> #create a mapping
> my_mapping = {
>     'my_example': {
>         'properties': {
>         'doc_id': {'type': 'integer'},
>         'price': {'type': 'integer'},
>          'color': {'type': 'string', 'index': 'not_analyzed'},
>          'make': {'type': 'string', 'index': 'not_analyzed'},
>          'city': {'type': 'string', 'index': 'not_analyzed'},
>          'insurance': {'type': 'string', 'index': 'not_analyzed'},
>          'sold': {'type': 'date'}
> }}}
>
>
> #create an index and add the mapping
> if es.indices.exists('my_index_test'):
>     es.indices.delete(index="my_index_test")
> es.indices.create('my_index_test')
>
> # mapping for the document type
> if es.indices.exists_type(index = 'my_index_test', doc_type = 'my_example'):
>     es.indices.delete_mapping(index='my_index_test',doc_type='my_example')
>
> es.indices.put_mapping(index='my_index_test',doc_type='my_example',body=my_mapping)
>
> # indexing
> for xRow in xData:
>     es.index(index = 'my_index',
>              doc_type= 'my_example',
>              id = xRow['doc_id'],
>              body = xRow
>              )
>
>
> ### MANUALLY CHAINING WORKS
>
> a = A('terms', field = 'color')
> b = A('terms', field = 'make')
> c = A('terms', field = 'city')
>
> s1 = dsl_search(es, index = 'my_index', doc_type= 'my_example')
> s1.aggs.bucket('xColor', a).bucket('xMake', b).bucket('xCity', c)\
>                           .metric('xMyPriceSum', 'sum', field = 'price')\
>                           .metric('xMyPriceAvg', 'avg', field = 'price')
> resp = s1.execute()
> #get results
> q1 = resp.aggregations
> q1
>
>
>
> #### but not PROGRAMMATICALLY
> # Programmatically chaining
>
> xVarBuckets = [{'field': 'color', 'label': 'xColor'},
>                {'field': 'make',  'label': 'xMake'},
>                {'field': 'city',  'label': 'xCity'}]
>
> xVar_Metrics = [{'field': 'price', 'agg_function': 'sum', 'label':
> 'xMyPriceSum'},
>                 {'field': 'price', 'agg_function': 'avg', 'label':
> 'xMyPriceAvg'}]
>
>
> s2 = dsl_search(es, index = 'my_index', doc_type = 'my_example')
>
> #add buckets
> for xBucketVar in xVarBuckets:
>     xAgg = A('terms', field= xBucketVar['field'])
>     s2.aggs.bucket(xBucketVar['label'], xAgg)
> resp2 = s2.execute()
> #get results
> q2 = resp2.aggregations
>
>
> I guess it has to do with the fact that the newly created bucket is
> overwritten by the next bucket, but how can I append the new bucket to the
> previous one?
>
>
> Any help appreciated
>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to elasticsearch+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/fa471fcf-9ed7-49f9-9e34-4cbefb90abb8%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/fa471fcf-9ed7-49f9-9e34-4cbefb90abb8%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>



-- 
Honza Král
Python Engineer
honza.k...@elastic.co
