Re: Create mapping for nested json

2015-04-07 Thread Krishna Raj
What version of ES are you on? I faced this issue due to a bug in
lower versions, but I was successful once I upgraded to a newer version.

Thanks,
Kr

On Mon, Apr 6, 2015 at 9:42 PM, secs...@gmail.com wrote:

 The culprit seems to be Kibana :(

 I sort of forced ES to show its hand by explicitly analyzing and storing
 all fields:

 curl -XPUT localhost:9200/_template/metrics -d '{
   "template": "metrics",
   "order": 2,
   "settings": {
     "index.refresh_interval": "5s"
   },
   "mappings": {
     "metric": {
       "properties": {
         "Activities": {
           "type": "object",
           "properties": {
             "ActivityName": { "type": "string", "index": "analyzed", "store": true },
             "ActivityFields": {
               "type": "object",
               "properties": {
                 "FieldName": { "type": "string", "index": "analyzed", "store": true },
                 "valueCounts": {
                   "type": "object",
                   "properties": {
                     "valueName": { "type": "string", "index": "analyzed", "store": true },
                     "valueCount": { "type": "integer", "store": true }
                   }
                 }
               }
             }
           }
         }
       }
     }
   }
 }'

 The resulting JSON in Kibana shows all the extracted fields - only it
 doesn't show them as facets!! It discovers them but won't show them as
 facets/aggregates. I can search for /Activities.ActivityName: SSH/, but
 there is no faceting. Very frustrating. Is there a workaround?
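
 (To take Kibana out of the loop entirely, a terms facet run straight
 against ES should show whether the field can be faceted on at all - a
 minimal sketch, assuming the indices created from the template are named
 metrics-*:)

 curl -XGET 'localhost:9200/metrics-*/_search?pretty' -d '{
   "size": 0,
   "facets": {
     "activity_names": {
       "terms": { "field": "Activities.ActivityName" }
     }
   }
 }'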





 On Wednesday, April 1, 2015 at 9:46:49 PM UTC-7, sec...@gmail.com wrote:

 Hi,

 I'm a noob at Elasticsearch. I am trying to push some nested JSON to
 Elasticsearch and have the nested objects parsed out as facets. If I use
 dynamic mapping, Elasticsearch does not seem to parse out the internal
 objects. I guess I need to define a mapping for my index?

 Example:

 {
   "Date": "2015-03-21T00:09:00",
   "Activities": [
     {
       "ActivityName": "SSH",
       "Fields": [
         {
           "User": [
             { "joe": 2, "jane": 3, "jack": 5 }
           ]
         },
         {
           "DstIP": [
             { "HostA": 3, "HostB": 5, "HostC": 6 }
           ]
         }
       ]
     }
   ]
 }

 I tried to follow the mapping documentation but failed to come up with a
 mapping that represents the JSON above; I am not sure how to map lists. If
 it helps, here's how I create the JSON in Scala using the Jackson library:

 scala> nestedMap
 res3: scala.collection.immutable.Map[String,Object] = Map(Date ->
 2015-03-21T00:09:00, Activities -> List(Map(ActivityName -> SSH, Fields ->
 List(Map(User -> List(Map(joe -> 2, jane -> 3, jack -> 5))), Map(DstIP ->
 List(Map(HostA -> 3, HostB -> 5, HostC -> 6)))

 scala> println(Serialization.write(nestedMap))
 {"Date":"2015-03-21T00:09:00","Activities":[{"ActivityName":"SSH","Fields":[{"User":[{"joe":2,"jane":3,"jack":5}]},{"DstIP":[{"HostA":3,"HostB":5,"HostC":6}]}]}]}

 Is there a way to get Jackson to emit a schema that can be fed directly
 to Elasticsearch as a mapping/template?
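
 (For lists of objects, the usual ES 1.x answer is the nested type - a
 minimal sketch using the field names from the example above; the dynamic
 per-user/per-host keys would still need remodeling into explicit
 name/count pairs before they can be mapped:)

 curl -XPUT localhost:9200/_template/metrics_nested -d '{
   "template": "metrics*",
   "mappings": {
     "metric": {
       "properties": {
         "Date": { "type": "date" },
         "Activities": {
           "type": "nested",
           "properties": {
             "ActivityName": { "type": "string", "index": "not_analyzed" }
           }
         }
       }
     }
   }
 }'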

 Thanks.







Any Elasticsearch hosting vendors that support Couchbase-Elasticsearch transport plugin?

2015-02-25 Thread Raj S
Does anyone know of such vendors? I already talked to one, and they
don't.

-Thanks
Rajesh



Re: Elasticsearch performance tuning

2015-02-19 Thread Deva Raj
Hi Mark Walkom,

I have given the Logstash conf file below:

Logstash conf

input {
  file {
  }
}

filter {
  mutate {
    gsub => [ "message", "\n", " " ]
  }
  mutate {
    gsub => [ "message", "\t", " " ]
  }
  multiline {
    pattern => "^ "
    what => "previous"
  }

  grok {
    match => [ "message",
      "%{TIME:log_time}\|%{WORD:Message_type}\|%{GREEDYDATA:Component}\|%{NUMBER:line_number}\| %{GREEDYDATA:log_message}" ]
    match => [ "path",
      "%{GREEDYDATA}/%{GREEDYDATA:loccode}/%{GREEDYDATA:_machine}\:%{DATE:logdate}.log" ]
    break_on_match => false
  }

  # To check whether location is S or L
  if [loccode] == "S" or [loccode] == "L" {
    ruby {
      code => "temp = event['_machine'].split('_')
               if !temp.nil? || !temp.empty?
                 event['_machine'] = temp[0]
               end"
    }
  }
  mutate {
    add_field => [ "event_timestamp", "%{@timestamp}" ]
    replace => [ "log_time", "%{logdate} %{log_time}" ]
    # Remove the 'logdate' field since we don't need it anymore.
    lowercase => [ "loccode" ]
    remove => [ "logdate" ]
  }
  # to get all site details (site name, city and co-ordinates)
  sitelocator {
    sitename => "loccode"
    datafile => "vendor/sitelocator/SiteDetails.csv"
  }
  date {
    locale => "en"
    match => [ "log_time", "yyyy-MM-dd HH:mm:ss", "MM-dd-yyyy HH:mm:ss.SSS", "ISO8601" ]
  }
}

output {
  elasticsearch {
  }
}



I checked step by step to find the bottleneck filter. The date filter
below took the most time. Can you guide me on how I can tune it to be
faster?

date { locale => "en" match => [ "log_time", "yyyy-MM-dd HH:mm:ss",
"MM-dd-yyyy HH:mm:ss.SSS", "ISO8601" ] }
http://serverfault.com/questions/669534/elasticsearch-performance-tuning#comment818613_669558
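
(One hedged tweak to try: the date filter attempts the match patterns in
order, so listing the format that actually matches most of your events
first avoids a failed parse attempt per event. This assumes the
MM-dd-yyyy form is the common one in your logs:)

date {
  locale => "en"
  # Most frequent format first; patterns are tried in order.
  match  => [ "log_time", "MM-dd-yyyy HH:mm:ss.SSS", "yyyy-MM-dd HH:mm:ss", "ISO8601" ]
}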


Thanks
Devaraj



Re: Elasticsearch performance tuning

2015-02-19 Thread Deva Raj

I have listed the instances and their heap sizes below.

Medium instance: 3.75 GB RAM, 1 core, storage: 4 GB SSD, 64-bit, network
performance: moderate

Java heap size: 2 GB

R3 Large: 15.25 GB RAM, 2 cores, storage: 32 GB SSD

Java heap size: 7 GB

R3 High-Memory Extra Large (r3.xlarge): 30.5 GB RAM, 4 cores

Java heap size: 15 GB
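
(For reference, on ES 1.x the heap is normally set through the
ES_HEAP_SIZE environment variable before starting the node - e.g. for the
R3 Large above:)

export ES_HEAP_SIZE=7g
bin/elasticsearch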


Thanks
Devaraj

On Friday, February 20, 2015 at 4:15:12 AM UTC+5:30, Mark Walkom wrote:

 Don't change cache and buffer sizes unless you know what is happening; the
 defaults are going to be fine.
 How much heap did you give ES?

 I'm not sure you can do much about the date filter though, maybe someone 
 else has pointers.







Elasticsearch performance tuning

2015-02-18 Thread Deva Raj
Hi All,

 In a single-node Elasticsearch setup along with Logstash, we tested 20 MB
and 200 MB file parsing into Elasticsearch on different types of AWS
instances, i.e. medium, large and xlarge.

Environment details: medium instance, 3.75 GB RAM, 1 core, storage: 4 GB
SSD, 64-bit, network performance: moderate
Instance running: Logstash, Elasticsearch

Scenario 1

**With default settings**
Result:
20 MB logfile: 23 mins, events per second: 175
200 MB logfile: 3 hrs 3 mins, events per second: 175


Added the following to the settings:
Java heap size : 2GB
bootstrap.mlockall: true
indices.fielddata.cache.size: 30%
indices.cache.filter.size: 30%
index.translog.flush_threshold_ops: 5
indices.memory.index_buffer_size: 50%

# Search thread pool
threadpool.search.type: fixed
threadpool.search.size: 20
threadpool.search.queue_size: 100

**With added settings**
Result:
20 MB logfile: 22 mins, events per second: 180
200 MB logfile: 3 hrs 07 mins, events per second: 180

Scenario 2

Environment details: R3 Large, 15.25 GB RAM, 2 cores, storage: 32 GB SSD,
64-bit, network performance: moderate
Instance running: Logstash, Elasticsearch

**With default settings**
Result:
20 MB logfile: 7 mins, events per second: 750
200 MB logfile: 65 mins, events per second: 800

Added the following to the settings:
Java heap size: 7 GB
Other parameters same as above

**With added settings**
Result:
20 MB logfile: 7 mins, events per second: 800
200 MB logfile: 55 mins, events per second: 800

Scenario 3

Environment details:
R3 High-Memory Extra Large (r3.xlarge), 30.5 GB RAM, 4 cores, storage:
32 GB SSD, 64-bit, network performance: moderate
Instance running: Logstash, Elasticsearch

**With default settings**
Result:
20 MB logfile: 7 mins, events per second: 1200
200 MB logfile: 34 mins, events per second: 1200

Added the following to the settings:
Java heap size: 15 GB
Other parameters same as above

**With added settings**
Result:
20 MB logfile: 7 mins, events per second: 1200
200 MB logfile: 34 mins, events per second: 1200

I wanted to know:

1. What is the benchmark for this performance?
2. Does the performance meet the benchmark, or is it below it?
3. Why am I not able to see a difference even after I increased the
Elasticsearch JVM heap?
4. How do I monitor Logstash and improve its performance?

I appreciate any help on this, as I am new to Logstash and Elasticsearch.



Re: Elasticsearch performance tuning

2015-02-18 Thread Deva Raj
Hi Mark Walkom,

Thanks Mark. Did I miss anything in tuning the performance of
Elasticsearch?

Added the following to the Elasticsearch settings:
Java heap size: half of physical memory
bootstrap.mlockall: true
indices.fielddata.cache.size: 30%
indices.cache.filter.size: 30%
index.translog.flush_threshold_ops: 5
indices.memory.index_buffer_size: 50%


On Thursday, February 19, 2015 at 7:25:27 AM UTC+5:30, Mark Walkom wrote:

 1. It depends
 2. It depends
 3. It depends
 4. It also depends.

 The performance of ES depends on you: your data, your use, your
 queries, your hardware, your configuration. If those are the results you
 got, then they are indicative of your setup and are thus your benchmark;
 from there you can tweak and try to improve performance.

 Monitoring LS is a little harder as there are no APIs for it (yet). Most
 of its performance will come down to your filters (especially grok).






Re: not able to refine from o/p of query in logstash

2015-02-02 Thread raj@
Can someone shed some light on my problem? I am sorry to bump this thread
again. I cannot figure out whether there is any option to call another
script from within my Logstash query to meet my requirements.

Please let me know if anything is not clear.

Thanks in advance!







search users based on degree of connections - for typeahead functionality

2015-01-30 Thread Raj S
I have implemented term search based on nGrams/filters/tokenizers for
typeahead functionality, where the user types part of a name and it
brings up users matching the typed text.

But I have a requirement around data modeling and implementation: when
someone (user A) searches for other users, it should bring up users
matching his/her term in the following order:

- People whom I follow
- People who follow me
- Everyone else

Has anyone solved this problem using Elasticsearch? If yes, what should
the data model and mappings be?
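
(One possible shape for this - a sketch only, with made-up ids and field
names: store a followers array of user ids on each user document, look up
user A's own following list first, then boost with should clauses:)

curl -XGET 'localhost:9200/users/_search' -d '{
  "query": {
    "bool": {
      "must": [
        { "match": { "name.ngram": "raj" } }
      ],
      "should": [
        { "ids":  { "values": ["u7", "u42"], "boost": 3.0 } },
        { "term": { "followers": { "value": "userA", "boost": 2.0 } } }
      ]
    }
  }
}'

Here the ids clause boosts the people A follows (A's following list,
supplied by the application), the term clause boosts people who follow A,
and everyone else still matches through the must clause alone.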

-Thanks
Rajesh



Re: not able to refine from o/p of query in logstash

2015-01-29 Thread raj@

Can anyone help me with this problem, please!






not able to refine from o/p of query in logstash

2015-01-27 Thread raj@
I am using the query below to pull information from Logstash:

curl -XGET 'http://logs:xx00/_all/_search?pretty=true' -d '{
  "query": {
    "bool": {
      "must": [
        { "match": { "_type": "pre" } },
        { "match": { "message": "MapDone" } },
        { "range": { "@timestamp": { "gte": "now-5m" } } }
      ]
    }
  }
}'
Output:

{ "took": 177, "timed_out": false, "_shards": { "total": 3225,
"successful": 3225, "failed": 0 }, "hits": { "total": 1238, "max_score":
4.3801584, "hits": [ { "_index": "fi-logstash-2015.01.21", "_type": "fi",
"_id": "CORYzNPHnnQeu09A", "_score": 4.3801584,
"_source": { "thread_name": "main", "message": "[MapDone]\tstandards.po.poRsxWrite in 169ms", "@timestamp": "2015-01-21T14:48:59.835+00:00", "level": "INFO", "mdc": {}, "file": "fi-1-small-log.json", "class": "fi.log.MapLogHandler", "line_number": 21, "logger_name": "fi.Mapper", "method": "info", "@version": 1, "source_host": "fi.pp", "host": "prefi2", "offset": 185244882, "type": "prefi", "tags": ["instance"], "syslog_severity_code": 5, "syslog_facility_code": 1, "syslog_facility": "user-level", "syslog_severity": "notice" } }

The above is only part of the output. I am trying to get only the map
name as output; when I try, I get errors.

Different sample maps:

formats.pure.qm.fromSIP.toCSV.write in 24ms
H044Grain.hub.asn.from.advanceShipNoticeWrite in 188ms
H9B1honey.hub.po.fromFEDSto.purchaseOrder in 416ms
HAEPrugs.hub.rsx.v7.r0.po.poFedsWrite in 231ms
H4Grain2.hub.in.fromtoAPP.invoiceWrite in 110ms
H2Home.v700.e4060.co.in.inFedsWrite in 108ms

I am trying to get:

1 - only mapping names (H4Grain2.hub.in.from.invoiceWrite)
2 - unique mappings (something like piping the previous output to uniq)
3 - the average of the last 1 minute's mappings

Can anybody help check whether this is possible? Thanks a ton in advance.
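
(One possible direction, sketched under assumptions not in the thread:
extract the map name into its own field at index time, e.g. with a grok
pattern in Logstash that produces a not_analyzed map_name field. The
unique names and counts over the last minute then come from a terms
aggregation:)

curl -XGET 'http://logs:xx00/_all/_search?pretty=true' -d '{
  "size": 0,
  "query": { "range": { "@timestamp": { "gte": "now-1m" } } },
  "aggs": {
    "maps": { "terms": { "field": "map_name" } }
  }
}'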






Re: not able to refine from o/p of query in logstash

2015-01-27 Thread raj@

Can anyone help with this? Just bumping this email; sorry if I am
breaking any rules.






Re: Crying for help:: MapperParsingException when trying to create index with mapping

2015-01-11 Thread Krishna Raj
Hi Masaru,

Thank You. 

It did not work. Because we were using 1.3.x version of elasticsearch and 
there is bug for solaris for mapping nested object.

I upgraded ES and the mapping is good now.

Thanks,
Krishna Raj

On Tuesday, January 6, 2015 at 2:00:01 PM UTC-8, Krishna Raj wrote:

 Hi,

 I am trying to create an index with a mapping which contains a nested
 object. I also tried to update the mapping of an empty, freshly created
 index with the same mapping below, which contains the nested object.

 But I am getting a MapperParsingException error every time. My cluster
 goes down and recovers automatically. I have been banging my head on
 this for the last week with no luck.

 Any help is greatly appreciated.

 Reference: 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-nested-type.html

 *My sample JSON:*

 {
   "timeStamp": "2014-12-31T19:15:45.000+0000",
   "metrics": [
     { "name": "viewList", "ave": 10.5 },
     { "name": "checkout", "ave": 20.5 },
     { "name": "login",    "ave": 30.5 },
     { "name": "logout",   "ave": 40.5 }
   ]
 }


 *Mapping I am trying:*

 curl -XPUT 'http://myhost:9201/testagg/testagg/_mapping' -d '{
   "testagg": {
     "properties": {
       "timeStamp": {
         "format": "dateOptionalTime",
         "type": "date"
       },
       "properties": {
         "metrics": {
           "type": "nested",
           "properties": {
             "name": { "type": "string" },
             "ave":  { "type": "double" }
           }
         }
       }
     }
   }
 }'
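
 (Worth noting: the mapping above nests metrics inside a second
 properties block, which defines a field literally named properties
 rather than the metrics field. A sketch of the presumably intended
 shape, with metrics directly under the top-level properties:)

 curl -XPUT 'http://myhost:9201/testagg/testagg/_mapping' -d '{
   "testagg": {
     "properties": {
       "timeStamp": { "format": "dateOptionalTime", "type": "date" },
       "metrics": {
         "type": "nested",
         "properties": {
           "name": { "type": "string" },
           "ave":  { "type": "double" }
         }
       }
     }
   }
 }'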


 Thanks,
 Krishna Raj




Re: What happens to data in an existing type if we update the mapping to specify 'path's for _id and _routing

2014-10-28 Thread Preeti Raj - Buchhada
Any ideas on this?


On Monday, October 27, 2014 3:03:16 PM UTC+5:30, Preeti Raj - Buchhada 
wrote:

 We are using ES 1.3.2.
 We have a need to specify custom id and routing values when indexing.
 We've been doing this using the Java APIs; however, we would now like to
 update the mapping to specify 'path's for _id and _routing.

 The questions we have are:
 1) Since this type already has a huge number of documents, can we change
 the mapping? When we tried it, we got an 'acknowledged: true' response,
 but it doesn't seem to be working when we tried indexing.
 2) In case there is a way to achieve this, will it affect only the new
 documents being indexed?





What happens to data in an existing type if we update the mapping to specify 'path's for _id and _routing

2014-10-27 Thread Preeti Raj - Buchhada
We are using ES 1.3.2.
We have a need to specify custom id and routing values when indexing.
We've been doing this using the Java APIs; however, we would now like to
update the mapping to specify 'path's for _id and _routing.

The questions we have are:
1) Since this type already has a huge number of documents, can we change
the mapping? When we tried it, we got an 'acknowledged: true' response,
but it doesn't seem to be working when we tried indexing.
2) In case there is a way to achieve this, will it affect only the new
documents being indexed?
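
(For readers following along, this is the shape of the change being
described - a sketch with made-up field names, using the path-based _id
and _routing that ES 1.x still supported:)

curl -XPUT 'localhost:9200/myindex/mytype/_mapping' -d '{
  "mytype": {
    "_id":      { "path": "order_id" },
    "_routing": { "path": "customer_id", "required": true },
    "properties": {
      "order_id":    { "type": "string", "index": "not_analyzed" },
      "customer_id": { "type": "string", "index": "not_analyzed" }
    }
  }
}'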



Re: Is there a way to update ES records using Spark?

2014-10-13 Thread Preeti Raj - Buchhada
Does anyone have an idea?
Even just knowing whether this is possible or not would be a great help.

Thanks.



On Wednesday, October 1, 2014 3:46:51 PM UTC+5:30, Preeti Raj - Buchhada 
wrote:

 I am using ES version 1.3.2 and Spark 1.1.0.
 I can successfully read and write records from/to ES using
 newAPIHadoopRDD() and saveAsNewAPIHadoopDataset().
 However, I am struggling to find a way to update records. Even if I
 specify a 'key' in ESOutputFormat, it gets ignored, as clearly documented.
 So my question is: is there a way to specify the document ID and custom
 routing values when writing to ES using Spark? If yes, how?




Re: Is there a way to update ES records using Spark?

2014-10-13 Thread Preeti Raj - Buchhada
Thanks for your reply, Costin.
However, we need to compute a custom ID by concatenating multiple field
values and then computing a hash of the result, so simply specifying
'es.mapping.id' will not help in our case.

Is there any other way?
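
(One possible approach, a sketch rather than an official answer: compute
the hash into an ordinary field on each document before writing, then
point es.mapping.id at that field - the id only needs to exist as a
field, not to be a single source value. The helper below is made up:)

import java.security.MessageDigest

// Hypothetical helper: build a stable id from selected field values.
def docId(doc: Map[String, Any], keys: Seq[String]): String = {
  val joined = keys.map(k => doc.getOrElse(k, "").toString).mkString("|")
  MessageDigest.getInstance("MD5")
    .digest(joined.getBytes("UTF-8"))
    .map("%02x".format(_)).mkString
}

// In the Spark job: docs.map(d => d + ("doc_id" -> docId(d, Seq("Date", "user"))))
// and then set "es.mapping.id" -> "doc_id" in the job configuration so
// es-hadoop uses the computed field as the Elasticsearch _id.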


On Monday, October 13, 2014 4:08:05 PM UTC+5:30, Costin Leau wrote:

 You can use the mapping options [1], namely `es.mapping.id`, to specify
 the id field of your documents.

 [1] 
 http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/configuration.html#cfg-mapping
  

 On Mon, Oct 13, 2014 at 12:55 PM, Preeti Raj - Buchhada 
 pbuc...@gmail.com wrote:
  Anyone has an idea? 
  At least if I get to know whether this is possible or not, that'll be a 
  great help. 
  
  Thanks. 
  
  
  
  On Wednesday, October 1, 2014 3:46:51 PM UTC+5:30, Preeti Raj - Buchhada 
  wrote: 
  
  I am using ES version 1.3.2, and Spark 1.1.0. 
  I can successfully read and write records from/to ES using 
  newAPIHadoopRDD() and saveAsNewAPIHadoopDataset(). 
  However, I am struggling to find a way to update records. Even I 
 specify a 
  'key' in ESOutputFormat it gets ignored, as documented clearly. 
  So my question is : Is there a way to specify document ID and custom 
  routing values when writing to ES using Spark? If yes, how? 
  




Is there a way to update ES records using Spark?

2014-10-01 Thread Preeti Raj - Buchhada
I am using ES version 1.3.2 and Spark 1.1.0.
I can successfully read and write records from/to ES using
newAPIHadoopRDD() and saveAsNewAPIHadoopDataset().
However, I am struggling to find a way to update records. Even if I
specify a 'key' in ESOutputFormat, it gets ignored, as clearly documented.
So my question is: is there a way to specify the document ID and custom
routing values when writing to ES using Spark? If yes, how?



Re: ES Plugin to extend Lucene's Standard Tokenizer

2014-09-09 Thread Raj Gupta
Hi Vineeth,
I haven't looked at the plugin Bryan has created.
However, creating a plugin for special characters gives better performance
than a pattern tokenizer or custom filters.
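
(For comparison, the pattern-tokenizer route looks roughly like this - a
hedged sketch that keeps @ and # inside tokens by splitting on everything
else; the names are made up:)

curl -XPUT 'localhost:9200/myindex' -d '{
  "settings": {
    "analysis": {
      "tokenizer": {
        "words_with_marks": { "type": "pattern", "pattern": "[^\\w@#]+" }
      },
      "analyzer": {
        "mention_analyzer": {
          "type": "custom",
          "tokenizer": "words_with_marks",
          "filter": ["lowercase"]
        }
      }
    }
  }
}'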
Regards,
Raj

On Tuesday, September 9, 2014 9:06:08 AM UTC+5:30, vineeth mohan wrote:

 Hello Bryan ,

 Congrats on your first plugin. 
 I have a question here - Can you implement the whole plugin by using 
 http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pattern-tokenizer.html
  
 tokenizer ? 

 Is your plugin providing any advantage over going this approach ?

 Thanks
   Vineeth

 On Tue, Sep 9, 2014 at 7:56 AM, Bryan Warner bryan@gmail.com wrote:

 Hi all,

 Recently, I've been working on an extension to Lucene's Standard
 Tokenizer that allows the user to customize / override the default word
 boundary break rules for Unicode characters. The Standard Tokenizer
 implements the word break rules from the Unicode Text Segmentation
 http://www.unicode.org/reports/tr29/ algorithm, where most punctuation
 symbols (except for underscore '_') are treated as hard word breaks (e.g.
 @foo and #foo are tokenized to foo). While the Standard Tokenizer works
 great in most cases, I found that being unable to override the default
 word break rules was quite limiting, especially since a lot of these
 punctuation symbols now carry important meaning on the web (@ - mentions,
 # - hashtags, etc.)

 I've wrapped this extension to the Standard Tokenizer in an ElasticSearch 
 plugin, which can be found at - 
 https://github.com/bbguitar77/elasticsearch-analysis-standardext ... 
 definitely looking for feedback as this is my first go at an ElasticSearch 
 plugin!

 I'm hoping other ElasticSearch / Lucene users find this helpful.

 Cheers!
 Bryan







Pattern_capture filter emits a token that is not matched with pattern also.

2014-08-14 Thread Raj
I have a case where I have to extract the domain part from email
addresses found in text. I use the uax_url_email tokenizer so that emails
come out as a single token, and I have a pattern_capture filter which
emits the @(.+) pattern. But uax_url_email also returns plain words that
are not emails, and the pattern_capture filter does not filter those out.
Any suggestions?

"custom_analyzer": {
  "tokenizer": "uax_url_email",
  "filter": [
    "email_domain_filter"
  ]
},
"filter": {
  "email_domain_filter": {
    "type": "pattern_capture",
    "preserve_original": false,
    "patterns": [
      "@(.+)"
    ]
  }
}

*Input string*: my email id is x...@gmail.com
*Output tokens*: my, email, id, is, gmail.com

But I need only gmail.com.
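
(One workaround to try - a sketch, untested: blank out tokens that
contain no @ before the capture filter runs, then drop the now-empty
tokens with a length filter. The filter names here are made up:)

curl -XPUT 'localhost:9200/emails' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "custom_analyzer": {
          "tokenizer": "uax_url_email",
          "filter": ["drop_non_emails", "drop_empty", "email_domain_filter"]
        }
      },
      "filter": {
        "drop_non_emails": {
          "type": "pattern_replace",
          "pattern": "^[^@]*$",
          "replacement": ""
        },
        "drop_empty": { "type": "length", "min": 1 },
        "email_domain_filter": {
          "type": "pattern_capture",
          "preserve_original": false,
          "patterns": ["@(.+)"]
        }
      }
    }
  }
}'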
