How to treat sorting for string fields as not_analyzed

2015-03-26 Thread pulkitsinghal
Only when performing a sort operation, I would like to treat a string field 
like 
{name: "first last"}

as not_analyzed. Is this possible?
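
One approach commonly used with Elasticsearch 1.x, offered here as a sketch and not as an answer taken from this thread, is to map the field as a multi-field: `name` stays analyzed for searching, and a `not_analyzed` sub-field is used only in the sort clause. The sub-field name `raw` and the helper function names are assumptions made for illustration.

```python
import json


def name_mapping():
    """Mapping fragment: analyzed `name` plus a not_analyzed `name.raw` sub-field."""
    return {
        "properties": {
            "name": {
                "type": "string",
                "fields": {
                    # Hypothetical sub-field name; stores the whole value as one term.
                    "raw": {"type": "string", "index": "not_analyzed"}
                },
            }
        }
    }


def sorted_search_body(query_text):
    """Search the analyzed field, but sort on the untouched sub-field."""
    return {
        "query": {"match": {"name": query_text}},
        "sort": [{"name.raw": {"order": "asc"}}],
    }


if __name__ == "__main__":
    print(json.dumps(name_mapping(), indent=2))
    print(json.dumps(sorted_search_body("first last"), indent=2))
```

With a mapping like this, "first last" would sort as a single value instead of by its individual tokens.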

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/b02b4a7b-7505-4a30-be81-c727898c18b3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Disabling dynamic mapping

2014-11-10 Thread pulkitsinghal
What does the JSON in the curl request for this look like?

The dynamic creation of mappings for unmapped types can be completely 
disabled by setting *index.mapper.dynamic* to false.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-dynamic-mapping.html#mapping-dynamic-mapping

Thanks!
- Pulkit
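
A minimal sketch of what such a request might look like, assuming the setting is applied per index at creation time (it can reportedly also be set node-wide in elasticsearch.yml). The host, the index name `my_index`, and the helper names are hypothetical; the code only renders the command and sends nothing.

```python
import json


def create_index_body():
    """Body for index creation with dynamic type mapping switched off."""
    return {"settings": {"index.mapper.dynamic": False}}


def as_curl(host, index):
    """Render the equivalent curl command as a string (nothing is sent)."""
    body = json.dumps(create_index_body())
    return "curl -XPUT '%s/%s' -d '%s'" % (host, index, body)


if __name__ == "__main__":
    print(as_curl("http://localhost:9200", "my_index"))
```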



Re: Large results sets and paging for Aggregations

2014-11-09 Thread pulkitsinghal
Sharing a response I received from Igor Motov:

"scroll works only to page results. paging aggs doesn't make sense since 
aggs are executed on the entire result set. therefore if it managed to fit 
into the memory you should just get it. paging will mean that you throw 
away a lot of results that were already calculated. the only way to "page" 
is by limiting the results that you are running aggs on. for example if 
your data is sorted by date and you want to build histogram for the results 
one date range at a time."
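
The "one date range at a time" idea can be sketched roughly as follows. This is an illustration under assumptions, not Igor's code: the field name, the window size, and the helper names are hypothetical, and the `filtered` query form is the Elasticsearch 1.x style.

```python
from datetime import date, timedelta


def date_windows(start, end, days):
    """Yield (from, to) ISO date pairs covering [start, end) in fixed-size steps."""
    cursor = start
    while cursor < end:
        upper = min(cursor + timedelta(days=days), end)
        yield cursor.isoformat(), upper.isoformat()
        cursor = upper


def histogram_body(field, gte, lt):
    """Aggregate only the documents that fall inside one date window."""
    return {
        "size": 0,  # the hits themselves are not needed, only the aggregation
        "query": {
            "filtered": {
                "filter": {"range": {field: {"gte": gte, "lt": lt}}}
            }
        },
        "aggs": {
            "per_day": {"date_histogram": {"field": field, "interval": "day"}}
        },
    }


if __name__ == "__main__":
    for gte, lt in date_windows(date(2014, 11, 1), date(2014, 11, 10), 4):
        print(histogram_body("sale_date", gte, lt))
```

Each request body would be sent as its own search, so only one window's buckets are held in the response at a time.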




Re: Infinite scroll best practices with ES

2014-11-09 Thread pulkitsinghal
In this discussion, I will rely on this page for 
reference: 
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html

At my level, I cannot really make a recommendation but I can share some 
questions going through my head, which if you fill in the blanks, might 
help you come to a reasonable decision ...


   1. The docs say flat out: "Scrolling is not intended for real time 
   user requests, but rather for processing large amounts of data" ... but for 
   a user scrolling an infinite window with 250 results per pseudo-page ... 
   can we really consider that real time? If not, then I guess it's reasonable 
   to look at scroll for infinite paging.
   2. If you have 100 users accessing your application then that means you 
   are serving the infinite-scroll screen via 100 scroll-queries ... what is a 
   reasonable timeout period for your scroll queries?
  1. Should they stay alive for 10m each ... is that how long a user 
  session is for you on average?
  2. Quoting the docs before my next thought: "The scroll parameter 
  (passed to the search request and to every scroll request) tells 
  Elasticsearch how long it should keep the search context alive. Its value 
  does not need to be long enough to process all data — it just needs to be 
  long enough to process the previous batch of results. Each scroll request 
  (with the scroll parameter) sets a new expiry time."
  3. What happens when a scroll query timeout expires? I think your app 
  will need to be smart enough to issue a new search_then_scroll chain but:
 1. the results will shift a bit if new data has been indexed
 2. your app will probably need to keep track of the fact that the user 
 was on pseudo-page 5 of their infinite scroll (looking at results # 
 1000-1250) and then get the new page #6 by searching first and then 
 scrolling through pages # 2,3,4,5 until it gets to page #6 ... so the app 
 needs to be very smart ... depending on your comfort level with writing 
 code, you may or may not consider this to be a big deal
 3. also there is no way to scroll backwards, which means you'll need to 
 cache any pages you've retrieved on the client side. Does your app have 
 sufficient memory?
 4. will you choose to replace the cache when something like (3.2) 
 happens? I would, but again the shifting results may or may not feel odd 
 to the user.
  4. Will you be happy with the performance of a plain scroll query, or will 
  you want something more like the SCAN search_type as well? In that case 
  there is no sorting at all, so when a timeout happens, the rows of data 
  will seem to have moved around a LOT to a user with a good memory who was 
  scrolling the page. So it would be best to just make users start from the 
  beginning in such a scenario.
  5. I also imagine you will shorten the timeout based on how many 
  users are hammering your application, so when you try to strike a balance 
  between the average user engagement time and the scroll query timeout, 
  you'll have a lower bound determined (somewhat) by how long users stick 
  around on your page and scroll: say, no more than 25 seconds on average. 
  I'm making these metrics up but you get the idea. So you wouldn't want to 
  set a timeout smaller than 25 seconds then.
   3. The good news is if you do all this, then people will love you for 
   sharing your infinite client code :)
   4. Some practical advice: I have an angular+ionic app for selling 
   clothes, shoes, etc. where I have NOT set up infinite scroll (the default is 
   like 1000 products), and I explain my decision to my team like so: I want my 
   users to *search* and not scroll. This is just an opinion, so if you took 
   the time to write a smart client, and if it was open-source, I would 
   totally consider using it :)
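
The bookkeeping described in the list (restart the search after a scroll expires, then scroll forward again to the page the user was on) could be sketched like this. Everything here is hypothetical: `start_scroll` and `next_batch` stand in for whatever wraps the real search and scroll requests, and `FakeIndex` is an in-memory substitute used only to exercise the logic.

```python
def refetch_page(start_scroll, next_batch, target_page):
    """Restart a scroll and walk forward until `target_page` (1-based).

    `start_scroll()` returns (scroll_id, hits) for page 1 and
    `next_batch(scroll_id)` returns (scroll_id, hits) for the following page.
    Both are hypothetical callables wrapping the real search and scroll calls.
    """
    scroll_id, hits = start_scroll()
    page = 1
    # Stop early if the result set runs out before the target page.
    while page < target_page and hits:
        scroll_id, hits = next_batch(scroll_id)
        page += 1
    return scroll_id, hits


class FakeIndex:
    """In-memory stand-in for Elasticsearch, used only to exercise the logic."""

    def __init__(self, docs, page_size):
        self.docs = docs
        self.page_size = page_size

    def start_scroll(self):
        return self.page_size, self.docs[: self.page_size]

    def next_batch(self, offset):
        # The "scroll id" here is simply the offset of the next page.
        end = offset + self.page_size
        return end, self.docs[offset:end]


if __name__ == "__main__":
    index = FakeIndex(list(range(10)), page_size=3)
    print(refetch_page(index.start_scroll, index.next_batch, target_page=3))
```

Note that every intermediate page is fetched and discarded, which is the cost the list above warns about.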

Cheers!
- Pulkit



Large results sets and paging for Aggregations

2014-11-09 Thread pulkitsinghal
Based on the reference docs I couldn't figure out what happens when the 
aggregation result set is very large. Does it get cut off? What is the 
upper bound? Does ES crash?

I see closed issues that indicate that pagination for aggregations will not 
be supported (https://github.com/elasticsearch/elasticsearch/issues/4915) 
BUT does that mean we can still get the entire result set without missing 
anything in the response?

Is the best way to do this via a scroll (non-scan) query, and will that 
give the entire aggregation result set in the very first response, no 
matter how huge?

Thanks!
- Pulkit
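
For what it's worth, a sketch of a request that asks for every bucket in a single response, assuming an Elasticsearch 1.x release where a terms aggregation with `size` set to 0 is documented as unlimited. The field name and helper name are hypothetical, and whether the full result fits in memory is a separate question.

```python
import json


def all_buckets_body(field):
    """Search body that requests all terms buckets and no hits."""
    return {
        "size": 0,  # skip the hits themselves
        "aggs": {
            # Assumption: a terms `size` of 0 means "do not limit the buckets".
            "all_values": {"terms": {"field": field, "size": 0}}
        },
    }


if __name__ == "__main__":
    print(json.dumps(all_buckets_body("category"), indent=2))
```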



Re: Can update by script be performed on query results atomically?

2014-10-14 Thread pulkitsinghal
Upon further exploration I found 
https://github.com/yakaz/elasticsearch-action-updatebyquery ... where the 
update by query API is packaged as a plugin and allows all documents matching 
the query to be updated with a script.



Can update by script be performed on query results atomically?

2014-10-14 Thread pulkitsinghal
The Update API allows us to update a document based on a provided script, 
*removing some network roundtrips*. Is it possible in ES to perform such an 
update on all the results of a query?



Re: How to find null_value in query_string like we can in missing filter

2014-08-13 Thread pulkitsinghal
Can anyone help with this comparison between missing fields in filter vs. 
query?

On Sunday, August 3, 2014 12:18:05 PM UTC-5, pulkitsinghal wrote:
>
> I'm using elasticsearch v0.90.5
>
> With a missing filter, we can track missing fields:
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-missing-filter.html
> and make sure that a null_value also counts as missing.
>
> How can we do the same in a query_string?
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_field_names
> Based on my test so far, the non-existent field counts as missing but the 
> null_value field counts as present.
>
> How should I write my query?
>



How to find null_value in query_string like we can in missing filter

2014-08-03 Thread pulkitsinghal
I'm using elasticsearch v0.90.5

With a missing filter, we can track missing fields:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-missing-filter.html
and make sure that a null_value also counts as missing.

How can we do the same in a query_string?
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#_field_names
Based on my test so far, the non-existent field counts as missing but the 
null_value field counts as present.

How should I write my query?



Re: Running into CORS errors when using elasticsearch-js client

2014-03-29 Thread pulkitsinghal
Yup, this didn't have anything to do with the client. For anyone else who is 
a big fan of pairing elasticsearch with APIGEE ... the solution is 
here: 
http://stackoverflow.com/questions/21193647/apigee-pre-flight-options-requests

On Saturday, March 29, 2014 9:30:36 AM UTC-5, pulkitsinghal wrote:
>
> I'm running into the following error message when using the 
> elasticsearch-js <https://github.com/elasticsearch/elasticsearch-js/> client:
>>
>> No 'Access-Control-Allow-Origin' header is present on the requested 
>> resource. Origin 'http://localhost:9090' is therefore not allowed access.
>
>
> There is a brief mention of cross browser in the code:
>
> https://github.com/elasticsearch/elasticsearch-js/blob/30f27494644b0dd893f69d69327a5f41c723da52/src/lib/connectors/xhr.js?L-20#L20-L24
> But looking at that I'm not sure how to initialize the client any 
> differently.
>
> My topology looks like:
> express+nodejs (laptop) <-> APIGEE (proxy) <-> ElasticSearch
>
> Perhaps this is not about the elasticsearch-js client?
>
> Thanks!
> - Pulkit
>



Running into CORS errors when using elasticsearch-js client

2014-03-29 Thread pulkitsinghal
I'm running into the following error message when using the 
elasticsearch-js client:
>
> No 'Access-Control-Allow-Origin' header is present on the requested 
> resource. Origin 'http://localhost:9090' is therefore not allowed access.


There is a brief mention of cross browser in the code:
https://github.com/elasticsearch/elasticsearch-js/blob/30f27494644b0dd893f69d69327a5f41c723da52/src/lib/connectors/xhr.js?L-20#L20-L24
But looking at that I'm not sure how to initialize the client any 
differently.

My topology looks like:
express+nodejs (laptop) <-> APIGEE (proxy) <-> ElasticSearch

Perhaps this is not about the elasticsearch-js client?

Thanks!
- Pulkit



Re: Are certain fields excluded from being part of "_all" grouping ?

2014-03-25 Thread pulkitsinghal
After reading up on '_all' a bit more, I now realize that it's not 
implemented to collect the resulting tokens from fields but their "_source" 
values instead! So of course it won't work .. boo hoo :P
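
To make that concrete, here is a deliberately simplified model in plain Python. It is not Elasticsearch's analyzers: it assumes `_all` is analyzed with a standard-style analyzer (split and lowercase), while `name` also gets the 2 to 7 character n-gram filter, and it ignores `word_delimiter`. The product name "Cinnamon Roll" is made up.

```python
def ngrams(token, min_gram=2, max_gram=7):
    """All character n-grams of `token` with lengths min_gram..max_gram."""
    return {
        token[i : i + n]
        for n in range(min_gram, max_gram + 1)
        for i in range(len(token) - n + 1)
    }


def index_terms_name(text):
    """Rough stand-in for the custom analyzer: split, lowercase, n-gram."""
    terms = set()
    for word in text.lower().split():
        terms |= ngrams(word)
    return terms


def index_terms_all(text):
    """Rough stand-in for a standard-style analyzer: split and lowercase only."""
    return set(text.lower().split())


if __name__ == "__main__":
    value = "Cinnamon Roll"  # hypothetical product name
    print("cinna" in index_terms_name(value))
    print("cinna" in index_terms_all(value))
```

In this model the partial term "cinna" exists among the `name` terms but not among the `_all` terms, which would match the behaviour seen in the two queries.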

On Tuesday, March 25, 2014 10:53:10 AM UTC-5, pulkitsinghal wrote:
>
> I ran the following two queries on v0.90.5:
>
> POST /my_index/product/_search
> {"query":{"bool":{"must":[{"query_string":{"default_field":"_all","query":"cinna"}}]}}}
>
> POST /my_index/product/_search
> {"query":{"bool":{"must":[{"query_string":{"default_field":"name","query":"cinna"}}]}}}
>
>
> The query with the "_all" field did not return any results but the one 
> with "name" field returned 365 results.
>
> The "name" field is mapped like so:
>
>> "name" : {
>> "analyzer": "word_break",
>> "type": "string"
>> },
>
>
> Would/should this prevent it from falling under the "_all" grouping in 
> searches? 
>
>> {
>> "index":{
>> "analysis":{
>> "analyzer":{
>> "word_break":{
>> "type": "custom",
>> "tokenizer": "standard",
>> "filter":["word_delimiter","lowercase","custom_gram"]
>> }
>> },
>> "filter":{
>> "custom_gram":{
>> "type":"ngram",
>> "min_gram":2,
>> "max_gram":7
>> }
>> }
>> }
>> }
>> }
>
>
>



Are certain fields excluded from being part of "_all" grouping ?

2014-03-25 Thread pulkitsinghal
I ran the following two queries on v0.90.5:

> POST /my_index/product/_search
> {"query":{"bool":{"must":[{"query_string":{"default_field":"_all","query":"cinna"}}]}}}
>
> POST /my_index/product/_search
> {"query":{"bool":{"must":[{"query_string":{"default_field":"name","query":"cinna"}}]}}}


The query with the "_all" field did not return any results but the one with 
"name" field returned 365 results.

The "name" field is mapped like so:

> "name" : {
> "analyzer": "word_break",
> "type": "string"
> },


Would/should this prevent it from falling under the "_all" grouping in 
searches? 

> {
> "index":{
> "analysis":{
> "analyzer":{
> "word_break":{
> "type": "custom",
> "tokenizer": "standard",
> "filter":["word_delimiter","lowercase","custom_gram"]
> }
> },
> "filter":{
> "custom_gram":{
> "type":"ngram",
> "min_gram":2,
> "max_gram":7
> }
> }
> }
> }
> }




What is the best javascript client for ES with least # of dependencies?

2014-03-17 Thread pulkitsinghal
I want to use a javascript client for ES that has the least # of 
dependencies because it will make it easier to port it onto Parse.com as a 
cloud module. The more dependencies and/or files involved, the messier it 
looks when bringing in an existing npm module to an environment like Parse.

So far I'm inclined to use node-elasticsearch-client because it has no 
dependencies according to its package.json file.

The official elasticsearch-js client, on the other hand, has four other 
runtime dependencies.

> "dependencies": {
> "chalk": "~0.4",
> "forever-agent": "0.5.2",
> "lodash-node": "~2.4",
> "when": "~2.8"
>   },

If there is already a branch for this client that's more suited to porting 
to environments like Parse.com, can someone chime in and let me know about 
it?

I'm partially inclined to just use nothing but a simple HTTP request, in 
order to be lean, but I thought it might be worth exploring if I could go 
bigger.

Please feel free to share your thoughts and pointers.



Re: Indexing custom date format

2014-02-23 Thread pulkitsinghal
My current mapping looks like:

'sale_date': {
  'format': 'dateOptionalTime',
  'type': 'date'
}

On Sunday, February 23, 2014 4:35:36 PM UTC-6, pulkitsinghal wrote:
>
> All the out-of-the-box date formats are available here:
>
> http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-date-format.html
>
> But other than preprocessing a custom date format before indexing the 
> data, is there anything I can do on the mapping side to allow ES to process 
> a date like:
>
> "2013-11-23 22:39:11" and not throw errors like it is malformed at " 
> 22:39:11" ?
>
> Thanks!
>
> - Pulkit
>



Indexing custom date format

2014-02-23 Thread pulkitsinghal
All the out-of-the-box date formats are available here:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-date-format.html

But other than preprocessing a custom date format before indexing the data, 
is there anything I can do on the mapping side to allow ES to process a 
date like "2013-11-23 22:39:11" and not throw errors saying it is malformed 
at " 22:39:11"?

Thanks!

- Pulkit
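
A possible mapping-side answer, sketched under assumptions and not confirmed in this thread: the `format` setting of a date field accepts custom Joda-style patterns, and alternatives can be separated with `||`. The helper names are hypothetical, and the `strptime` check is only a local approximation of the second pattern.

```python
from datetime import datetime


def sale_date_mapping():
    """Mapping fragment accepting ISO dates or 'yyyy-MM-dd HH:mm:ss'."""
    return {
        "sale_date": {
            "type": "date",
            # `||` separates alternative formats; the second is a Joda-style pattern.
            "format": "dateOptionalTime||yyyy-MM-dd HH:mm:ss",
        }
    }


def parses_locally(value):
    """Client-side sanity check using the strptime equivalent of the pattern."""
    try:
        datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    print(sale_date_mapping())
    print(parses_locally("2013-11-23 22:39:11"))
```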



Re: Using curl with bulk API: ActionRequestValidationException Validation Failed: 1: no requests added

2014-02-16 Thread pulkitsinghal
Thanks, that was it!

On Sunday, February 16, 2014 2:28:24 PM UTC-6, David Pilato wrote:
>
> Check that you have \n after each line (last line needs it as well).
>
>
> -- 
> *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
> @dadoonet <https://twitter.com/dadoonet> | 
> @elasticsearchfr<https://twitter.com/elasticsearchfr>
>
>
> On 16 February 2014 at 21:22:48, pulkitsinghal 
> (pulkit...@gmail.com) 
> wrote:
>
>  Whenever I try to perform a bulk import from CURL, I get an error 
> stating: ActionRequestValidationException[Validation Failed: 1: no requests 
> added
>
> Can someone point out my mistake in this command?
>
> $ curl -u username:password -XPOST http://my.es.com:9200/_bulk -d'
> {"index":{"_index":"my_index_test","_type":"product","_id":"033fe3db-038f-11e3-a415-bc764e10976c"}}
> {"api_id":"033fe3db-038f-11e3-a415-bc764e10976c","name":"tray","short_description":"serve fresh things with a fresh look","long_description":"serve fresh things with a fresh look","price":19.5,"image_url":"https://s3.amazonaws.com/blah.jpg","barcodes":["CODE_128:tray","CODE_39:tray","MANUAL:tray","MANUAL:TRAY"]}
> '
>
>



Using curl with bulk API: ActionRequestValidationException Validation Failed: 1: no requests added

2014-02-16 Thread pulkitsinghal
Whenever I try to perform a bulk import from CURL, I get an error 
stating: ActionRequestValidationException[Validation Failed: 1: no requests 
added

Can someone point out my mistake in this command?

$ curl -u username:password -XPOST http://my.es.com:9200/_bulk -d'
{"index":{"_index":"my_index_test","_type":"product","_id":"033fe3db-038f-11e3-a415-bc764e10976c"}}
{"api_id":"033fe3db-038f-11e3-a415-bc764e10976c","name":"tray","short_description":"serve fresh things with a fresh look","long_description":"serve fresh things with a fresh look","price":19.5,"image_url":"https://s3.amazonaws.com/blah.jpg","barcodes":["CODE_128:tray","CODE_39:tray","MANUAL:tray","MANUAL:TRAY"]}
'
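
One way to avoid hand-assembling the payload is to generate it, so that every action line and every document sits on exactly one line and the body ends with a newline. The sketch below is an illustration with a hypothetical helper name; the resulting text could be written to a file and posted with curl's `--data-binary @file` option, which sends the file without altering its newlines.

```python
import json


def bulk_body(index, doc_type, docs):
    """Serialize (id, source) pairs as newline-delimited JSON for _bulk."""
    lines = []
    for doc_id, source in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        # json.dumps escapes embedded newlines, so each document stays on one line.
        lines.append(json.dumps(source))
    # Every line, including the last one, is terminated by a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    docs = [("033fe3db-038f-11e3-a415-bc764e10976c", {"name": "tray", "price": 19.5})]
    print(bulk_body("my_index_test", "product", docs), end="")
```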
