Re: Not able to search long values

2014-10-15 Thread Dan Fairs
Hi


> And I tried searching on CUSTOMER_NUMBER and MERCHANT_NUMBER from the Head
> plugin as well as from the Kibana UI.
> But it's not getting searched. I even tried to explicitly provide
> "index":"not_analyzed" for CUSTOMER_NUMBER and MERCHANT_NUMBER.
> Even that didn't work. Only after changing these fields to String, I was
> able to search on these fields. [Eg: CUSTOMER_NUMBER  1234567899876543210]
> Can someone tell me why the behaviour is like this?
>
>
It's probably Kibana. JavaScript stores every number as a double-precision
float, so integers above 2^53 lose precision - we encountered a similar
problem recently while messing around with head. Try running your query
directly with curl, or something else that doesn't have JavaScript in the
pipeline.
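
A minimal Python sketch of the precision loss, assuming the tool in the
pipeline treats the value as an IEEE 754 double (Python's float is the same
type). The value is the example CUSTOMER_NUMBER from the question; the helper
name is made up for illustration and nothing here talks to Elasticsearch or
Kibana.

```python
# Integers are only exactly representable in a double up to 2**53.
# Converting through float() mimics what a double-based client would see.
MAX_SAFE_INTEGER = 2**53 - 1  # same value as JavaScript's Number.MAX_SAFE_INTEGER


def survives_double_round_trip(value: int) -> bool:
    """Return True if the integer is unchanged after conversion to a double."""
    return int(float(value)) == value


customer_number = 1234567899876543210  # example value from the question

print("above safe range:", customer_number > MAX_SAFE_INTEGER)
print("survives round trip:", survives_double_round_trip(customer_number))
print("value after round trip:", int(float(customer_number)))
```

If the round trip changes the value, a query built by such a client is
searching for a different number from the one that was indexed.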

(Interesting aside: we noticed that the popular 'jq' utility has problems
with long values too...)

Cheers,
Dan

-- 
Dan Fairs  | @danfairs

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/CAJn_wTHham95PQ1JdEZiqE388cXOqcFQEVxdVCJnU6%3D54mMCwA%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


Re: Get total of distinct values in a field

2014-03-03 Thread Dan Fairs
> I'm indexing a lot of information from twitter and other social media, is 
> there a way I can get a count of all the distinct terms present in a field ? 
> I've seen something mentioned/closed on 
> https://github.com/elasticsearch/elasticsearch/issues/1044. But even with the 
> new aggregation framework I see how this would be possible, in this 
> particular case I'm trying to get the total number of authors (which could go 
> very high), I know that by using the terms facet I could get a count for 
> each author, but I'm interested only in the total.


It's not ideal, but we do this by creating a terms facet on the field in 
question, and counting the number of entries it returns.

I'd also note that we don't do this in live web requests, but in batch jobs, 
and store the result!
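
A rough Python sketch of that approach, not our actual batch job. The facet
name "distinct", the field name "author" and the sample response are all
invented for illustration; the request body follows the terms facet format,
and the facet's size has to be at least the number of distinct values or the
count will come out too low.

```python
import json


def distinct_count_request(field: str, max_terms: int) -> dict:
    """Build a search body with a terms facet on `field`.

    `max_terms` caps how many facet entries come back, so it must be at
    least the number of distinct values for the count to be right.
    """
    return {
        "query": {"match_all": {}},
        "size": 0,  # no hits needed, only the facet
        "facets": {
            "distinct": {"terms": {"field": field, "size": max_terms}}
        },
    }


def count_distinct(response: dict) -> int:
    """Count the entries returned by the `distinct` terms facet."""
    return len(response["facets"]["distinct"]["terms"])


# Hypothetical, hand-written response in the shape a terms facet returns.
sample_response = {
    "facets": {
        "distinct": {
            "_type": "terms",
            "missing": 0,
            "total": 6,
            "other": 0,
            "terms": [
                {"term": "alice", "count": 3},
                {"term": "bob", "count": 2},
                {"term": "carol", "count": 1},
            ],
        }
    }
}

print(json.dumps(distinct_count_request("author", 100000)))
print(count_distinct(sample_response))
```

Every distinct value is shipped back just to be counted, which is why this
belongs in a batch job rather than a live request.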

Cheers,
Dan
--
Dan Fairs | dan.fa...@gmail.com | @danfairs | secondsync.com

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/C08902A7-958D-4804-8023-F57E4950EFBB%40gmail.com.


Re: Relation Between Heap Size and Total Data Size

2014-02-26 Thread Dan Fairs
> So, I am wondering whether there is any relationship between heap size and 
> total data size? Is there any formula to determine heap size based on data size?


You might want to check that you're not running out of file handles:

  http://www.elasticsearch.org/tutorials/too-many-open-files/
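
A quick way to see the limit from Python, as a sketch only. It assumes a
Unix-like system and reports the limits of the Python process itself, which
match Elasticsearch's only if ES is started by the same user from the same
environment. The function name is made up for illustration.

```python
import resource  # standard library, Unix only


def open_file_limits() -> tuple:
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


soft, hard = open_file_limits()
print(f"soft limit: {soft}, hard limit: {hard}")
```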

Cheers,
Dan

--
Dan Fairs | dan.fa...@gmail.com | @danfairs | secondsync.com

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/FDC26956-0E46-4E2B-9A0D-1F899DDFD015%40gmail.com.


Re: [ANN] Knapsack export/import plugin released for 0.19.8 0.19.11 0.20.6 0.90.11 1.0.0

2014-02-20 Thread Dan Fairs
Hi Jörg

> to support five versions of Elasticsearch:
> 
> 0.19.8

Thanks again for releasing this - works very well. While we're not going to be 
using it, I did try the S3 version too - but it didn't seem to work. Here's the 
traceback (I used real access key, secret and bucket in my trial, of course!):

[2014-02-20 11:16:38,581][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Trapper] starting export: {"started":"2014-02-20T11:16:38.548Z","path":"unittest_transmissions.tar.gz","map":{},"uri":null,"copy":false,"s3":true}
[2014-02-20 11:16:38,607][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Trapper] getting settings for indices [unittest_transmissions]
[2014-02-20 11:16:38,611][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Trapper] found indices: [unittest_transmissions]
[2014-02-20 11:16:38,620][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Trapper] getting mappings for index unittest_transmissions and types []
[2014-02-20 11:16:38,636][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Trapper] found mappings: [transmissions]
[2014-02-20 11:16:38,636][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Trapper] adding mapping: transmissions
[2014-02-20 11:16:38,728][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Trapper] end of export: {"started":"2014-02-20T11:16:38.548Z","path":"unittest_transmissions.tar.gz","map":{},"uri":null,"copy":false,"s3":true}
[2014-02-20 11:16:38,729][INFO ][org.xbib.elasticsearch.action.RestExportAction] [Trapper] trying to transfer file unittest_transmissions.tar.gz to s3 :y...@s3.amazonaws.com?bucketName=mybucket&key=2014-02-20-df-test
Exception in thread "elasticsearch[Trapper][knapsack-export-[file:unittest_transmissions.tar.gz]][T#1]" java.util.ServiceConfigurationError: no connection factory found for :y...@s3.amazonaws.com?bucketName=mybucket&key=2014-02-20-df-test
        at org.xbib.io.ConnectionService.getConnectionFactory(ConnectionService.java:47)
        at org.xbib.elasticsearch.action.RestExportAction$ExportThread.run(RestExportAction.java:390)
        at java.lang.Thread.run(Thread.java:744)

This is ES 0.19.8, Java 1.7.0_45.

Again, this doesn't actually matter for us (I was just playing with it) but 
thought you'd like to know this didn't quite work as expected.

Cheers,
Dan
--
Dan Fairs | dan.fa...@gmail.com | @danfairs | secondsync.com

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/4C15A2D9-B16C-41BC-9207-9D2B96BB1222%40gmail.com.


Re: Knapsack Plugin

2014-02-15 Thread Dan Fairs
> I have just uploaded knapsack plugin 0.19.8.1 which is compatible with ES 0.19.8
> 
> https://github.com/jprante/elasticsearch-knapsack/tree/0.19
> 
> It contains some minor fixes which I am about to port to knapsack versions 
> for 0.19.11, 0.20.6, 0.90.11, and of course 1.0.0

Thanks so much for that - will give it a try tomorrow.

Thanks,
Dan

--
Dan Fairs | dan.fa...@gmail.com | @danfairs | secondsync.com

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/66A299E5-22E2-4D1E-8B15-DA37716CED5C%40gmail.com.


Re: Knapsack Plugin

2014-02-15 Thread Dan Fairs
> 0.19.8 needs a bit of backporting the BulkProcessor and some minor adaptions 
> regarding method signature changes.
> 
> I will come back with the announcement of the 0.19.8 plugin zip when it's 
> ready.
> 

That's excellent. Again - many thanks.

Cheers,
Dan
--
Dan Fairs | dan.fa...@gmail.com | @danfairs | secondsync.com

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/692B86DC-2CA1-4542-B958-7F59A9B419C4%40gmail.com.


Re: Knapsack Plugin

2014-02-15 Thread Dan Fairs
Hi Jörg

> Dan, no problem, I can build a version against 0.19.8

That would be much appreciated - many thanks! :)

(Happy to have a go myself if you can point me towards some basic instructions 
- but I suspect it's a 5-minute thing for you, and two days for me, as a Python 
person!)

Let me know where to grab the binaries from and I'll take it for a spin.

Cheers,
Dan
--
Dan Fairs | dan.fa...@gmail.com | @danfairs | secondsync.com

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/30035577-94BA-4C07-906D-B5DDA7033439%40gmail.com.


Knapsack Plugin

2014-02-14 Thread Dan Fairs
Hi Jörg,

Is there an (old) version of your Knapsack plugin around that is compatible 
with ES 0.19.8? Now 1.0 is out, we're keen to start using snapshot/restore - 
but I need to make a copy of our production ES dataset to try a migration on! 
The compatibility table on GH only goes back as far as 0.19.11.

Cheers,
Dan
--
Dan Fairs | dan.fa...@gmail.com | @danfairs | secondsync.com

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/6A01F468-9900-4EE8-83CF-ECD9C4EAF13F%40gmail.com.


Re: ES 0.20.5 adding a new node to a running cluster (unicast mode)

2014-02-03 Thread Dan Fairs
> yes of course, in one word: unicast seems like an overhead ... think
> of a scenario where you need to add more nodes to the cluster (a pretty
> common one for us): you need to change the yml config on each node and
> restart the cluster node by node, making sure that all other nodes
> "see" the new node

I'm pretty sure this is incorrect actually. I suspect if you had added some 
existing nodes' IPs to the discovery list on the new node, all the nodes would 
have formed a cluster. Information about what nodes are in the cluster is 
stored canonically on the master and replicated around - once a node is known 
about, its existence should be communicated to the other nodes.
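
As a sketch of what I mean, here is a small Python helper that renders the
discovery lines for the new node's elasticsearch.yml only. The helper name and
the IP addresses are made up for illustration, and whether the existing nodes
really need no change is my suspicion rather than something I've verified on
0.20.5.

```python
def unicast_settings(existing_hosts: list) -> str:
    """Render elasticsearch.yml lines for a new node joining via unicast.

    `existing_hosts` are addresses of nodes already in the cluster; only
    the new node's config needs to list them.
    """
    hosts = ", ".join(f'"{host}"' for host in existing_hosts)
    return (
        "discovery.zen.ping.multicast.enabled: false\n"
        f"discovery.zen.ping.unicast.hosts: [{hosts}]\n"
    )


# Hypothetical addresses of two nodes that are already running.
print(unicast_settings(["10.0.0.1", "10.0.0.2"]))
```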

Cheers,
Dan
--
Dan Fairs | dan.fa...@gmail.com | @danfairs | secondsync.com

-- 
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/E997488E-7B62-4966-9074-68F55A7345EA%40gmail.com.