Re: Custom in memory map/reduce using ES data

2015-02-05 Thread Hajime
I run Hazelcast in the same JVM and run map/reduce in memory. It works really
well. For a word count over about 10 blog entries, an ES request with Hazelcast
map/reduce finishes in less than 3 seconds.
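
For readers who want to picture the shape of such a job, here is a minimal sketch of an in-memory word count written as map/combine/reduce in plain Java. It is not the Hazelcast MapReduce API and not the code used above; the class WordCountSketch, its method names, and the idea of one partition per shard are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not Hazelcast or Elasticsearch API.
final class WordCountSketch {

    // Map + combine: count the words of one partition
    // (assumed here to be the documents returned by one shard).
    static Map<String, Long> mapPartition(List<String> documents) {
        Map<String, Long> counts = new HashMap<>();
        for (String doc : documents) {
            // Split on runs of non-word characters (ASCII word characters only).
            for (String token : doc.toLowerCase(Locale.ROOT).split("\\W+")) {
                if (!token.isEmpty()) {
                    counts.merge(token, 1L, Long::sum);
                }
            }
        }
        return counts;
    }

    // Reduce: merge the partial counts produced for each partition.
    static Map<String, Long> reduce(List<Map<String, Long>> partials) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((word, n) -> total.merge(word, n, Long::sum));
        }
        return total;
    }

    // Runs the map/combine step per partition, then a single reduce step.
    static Map<String, Long> wordCount(List<List<String>> partitions) {
        List<Map<String, Long>> partials = new ArrayList<>();
        for (List<String> partition : partitions) {
            partials.add(mapPartition(partition));
        }
        return reduce(partials);
    }
}
```

Calling WordCountSketch.wordCount with one list of document texts per partition returns the merged counts. Combining inside each partition first keeps the data handed to the reduce step small, which is the same reason a distributed framework combines locally before shipping results across nodes.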



-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAHm3ZspNp14L4LMZAa0Qjkg3MjOye0UzTmtMsoo8ip-t65etZw%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Custom in memory map/reduce using ES data

2015-02-03 Thread chengtao cheng
I have run into the same problem!






Custom in memory map/reduce using ES data

2014-10-22 Thread Hajime Takase
Hi,

I have around a billion records on 20 nodes and would like to run custom
map/reduce or aggregations (word count, sentiment analysis, etc.) immediately
after the ES result set is determined.

My first idea was to use the plugin system to write a custom aggregation, like this:
https://github.com/algolia/elasticsearch-cardinality-plugin/tree/1.0.X/src/main/java/org/alg/elasticsearch/search/aggregations/cardinality

but I want to update the jar quite often, which would require ES to be
reloaded each time. I then looked at the scripted map/reduce:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/1.4/search-aggregations-metrics-scripted-metric-aggregation.html

but I was not sure about its memory usage or how customizable it is, so I
decided to run Hazelcast or Spark on the same node or in the same JVM and use
their map/reduce framework. I use the filter phase to put the ES data into it, like this:
https://github.com/medcl/elasticsearch-filter-redis/blob/master/src/main/java/org/elasticsearch/index/query/RedisFilterParser.java#L121

but it takes quite a long time to put the data into the in-memory
middleware...

Is there a best practice for putting ES data into in-memory middleware, so
that a subsequent program can reuse the same data efficiently?
I don't think I can use the ES query result set (on each shard), which seems
to be held in memory, directly in my program. Am I right?

Thanks,

Haji
