Re: EC2 Discovery

2014-03-21 Thread ZenMaster80
I am not sure if I missed something, but I believe I already tried what you mentioned, as shown in my original post. I can connect from one instance to another. I can connect to each machine individually, and I am able to index and query it fine with the default configuration, without any zen or ec2 s

Re: EC2 Discovery

2014-03-21 Thread ZenMaster80
I am not sure if I missed something, but I believe I already tried what you mentioned, as shown in my original post. I can connect to each machine individually, and I am able to index and query it fine with the default configuration, without any zen or ec2 settings. But, when I turned them on like I

EC2 Discovery

2014-03-20 Thread ZenMaster80
Any clues as to what I am missing? I turned discovery trace on, but don't see any useful info. -- You received this message because you are subscribed to the Google Groups "elasticsearch" group. To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsu

Re: Occational client.transport.NoNodeAvailableException

2014-03-14 Thread ZenMaster80
I will post logs in a bit. I plan to run on EC2, but I am currently just running on a local machine (i7, 4 GB RAM). I had int concurrentRequests = Runtime.getRuntime().availableProcessors(); (returns 8). If I change this value to just "1", I don't get the exception, but indexing performance slows dow

Re: Occational client.transport.NoNodeAvailableException

2014-03-14 Thread ZenMaster80
Thursday, March 13, 2014 2:02:36 PM UTC-4, ZenMaster80 wrote: > > I get this during bulk processing (see the Error in the middle of the log) > - I tried it on my local machine and it doesn't happen; if I run it on some > remote EC2 instance, I do get the exception. Here is so

Re: Bulk Processor

2014-03-14 Thread ZenMaster80
basically flush the Bulk every n-1 docs > instead of n. > > Fix is on the way. > > -- > David ;-) > Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > > > On 12 March 2014 at 20:51, ZenMaster80 > > wrote: > > > I don't quite understand what

Re: How to install Mapping attachment Plugin with debian install

2014-03-13 Thread ZenMaster80
Pilato* | *Technical Advocate* | *Elasticsearch.com* > @dadoonet <https://twitter.com/dadoonet> | > @elasticsearchfr<https://twitter.com/elasticsearchfr> > > > On 13 March 2014 at 17:19:49, ZenMaster80 (sabda...@gmail.com) > wrote: > > On my local mach

How to install Mapping attachment Plugin with debian install

2014-03-13 Thread ZenMaster80
On my local machine, I do this: bin/plugin -install ... With the Debian installation, I am not sure where the "bin/plugin" folder is. Does anyone know?

Re: [Ann] Elasticsearch Image Plugin 1.1.0 released

2014-03-13 Thread ZenMaster80
Great, I am interested in trying this. On Thursday, March 13, 2014 7:09:38 AM UTC-4, Kevin Wang wrote: > > Hi All, > > I've released version 1.1.0 of Elasticsearch Image Plugin. > The Image Plugin is a Content Based Image Retrieval Plugin for > Elasticsearch using LIRE (Lucene Image Retrieval).

Mapping Attachment plugin Installtion/dubian

2014-03-13 Thread ZenMaster80
I am having trouble finding out how to install the above plugin. I installed Elasticsearch with Debian. Typically, on my local Linux machine, I did "/bin/plugin ". I am not sure where "bin/plugin" goes with the Debian installation. Thanks

Re: Bulk Processor question

2014-03-12 Thread ZenMaster80
e number of actions (as you use by > setting it to 1000) or a bulk request byte volume (default 5M). What you > see is the 5M limit kicking in; your docs are quite large. > > Jörg > > > On Wed, Mar 12, 2014 at 8:54 PM, ZenMaster80 > > wrote: > >> >> I don't
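
The two flush triggers named in the reply above (a number of actions, or a byte volume) can be illustrated with a toy buffer. This is only a sketch under stated assumptions: `ToyBulkBuffer` is a hypothetical class written for this illustration, not the Elasticsearch bulk processor, and the 100 KB document size is made up.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT the Elasticsearch bulk processor.
// It flushes as soon as EITHER the action count OR the byte volume limit is reached.
class ToyBulkBuffer {
    private final int maxActions;   // e.g. 1000, as in the thread
    private final long maxBytes;    // e.g. 5 MB, the default mentioned in the reply
    private int actions = 0;
    private long bytes = 0;
    // Number of actions in each flushed batch, kept for inspection.
    final List<Integer> flushedBatchSizes = new ArrayList<>();

    ToyBulkBuffer(int maxActions, long maxBytes) {
        this.maxActions = maxActions;
        this.maxBytes = maxBytes;
    }

    // Adds one document of the given size and flushes if a limit is hit.
    void add(long docSizeBytes) {
        actions++;
        bytes += docSizeBytes;
        if (actions >= maxActions || bytes >= maxBytes) {
            flush();
        }
    }

    // Records the current batch and resets the counters.
    void flush() {
        if (actions == 0) {
            return;
        }
        flushedBatchSizes.add(actions);
        actions = 0;
        bytes = 0;
    }

    public static void main(String[] args) {
        ToyBulkBuffer buffer = new ToyBulkBuffer(1000, 5L * 1024 * 1024);
        // Hypothetical large documents of 100 KB each.
        for (int i = 0; i < 200; i++) {
            buffer.add(100L * 1024);
        }
        // With large documents the byte limit trips long before 1000 actions.
        System.out.println(buffer.flushedBatchSizes);
    }
}
```

With documents this large, every batch in the toy run is far smaller than 1000 actions, which matches the behaviour described in the reply: the byte limit, not the action count, decides when a batch goes out.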

Bulk Processor question

2014-03-12 Thread ZenMaster80
I don't quite understand what the bulk processor is doing, and I would like someone to explain how it is supposed to work to make sure I designed this correctly. I specify the number of actions as 1000. My feeder keeps pushing documents to it (it's more like a loop iterating over document folders) where I

Bulk Processor

2014-03-12 Thread ZenMaster80
I don't quite understand what the bulk processor is doing, and I would like someone to explain how it is supposed to work to make sure I designed this correctly. I specify the number of actions as 1000. My feeder keeps pushing documents to it (it's more like a loop iterating over document folders), and

Re: BulkProcessor

2014-03-07 Thread ZenMaster80
> Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > > On 7 March 2014 at 21:51, ZenMaster80 > > wrote: > > If I set the bulk size (number of files) to 5000 and feed it 5000, 5000, 5000, > what happens if the number of files in the last batch is, for instance, 2000? How > doe

BulkProcessor

2014-03-07 Thread ZenMaster80
If I set the bulk size (number of files) to 5000 and feed it 5000, 5000, 5000, what happens if the number of files in the last batch is, for instance, 2000? How does it know that it needs to process the last 2000?
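
One common way a batching component handles a short final batch is to send whatever is still buffered when it is closed. To the best of my knowledge the Elasticsearch `BulkProcessor` behaves this way on `close()`, but the sketch below is only a toy under that assumption: `ToyBatcher` and its fields are hypothetical names, not Elasticsearch code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; not Elasticsearch code.
// Full batches are sent as they fill up; close() sends the leftover partial batch.
class ToyBatcher implements AutoCloseable {
    private final int batchSize;
    private final List<String> pending = new ArrayList<>();
    // Size of every batch that was "sent", kept for inspection.
    final List<Integer> sentBatchSizes = new ArrayList<>();

    ToyBatcher(int batchSize) {
        this.batchSize = batchSize;
    }

    // Buffers one document and sends a batch once the buffer is full.
    void add(String doc) {
        pending.add(doc);
        if (pending.size() >= batchSize) {
            send();
        }
    }

    // Pretends to send the buffered documents, then empties the buffer.
    private void send() {
        if (pending.isEmpty()) {
            return;
        }
        sentBatchSizes.add(pending.size());
        pending.clear();
    }

    // Closing flushes the remaining documents, however few there are.
    @Override
    public void close() {
        send();
    }

    public static void main(String[] args) {
        ToyBatcher batcher = new ToyBatcher(5000);
        for (int i = 0; i < 17000; i++) {
            batcher.add("doc-" + i);
        }
        System.out.println("before close: " + batcher.sentBatchSizes);
        batcher.close();
        System.out.println("after close:  " + batcher.sentBatchSizes);
    }
}
```

The design point is that the feeder does not need to know how many documents are left: it only has to close the batcher once it has finished pushing documents.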

Re: indexing binary

2014-02-27 Thread ZenMaster80
Sorry for the confusion - I do want PDFs, but I am concerned with the retrieval of the image file when its OCR text is searched. I must be missing something. As shown below, I provide two fields, "text" and "content". In your second post you say I don't need the "content" field for images? S

Re: indexing binary

2014-02-27 Thread ZenMaster80
Binh, Thanks. With your help I think I am closer to the answer. With the sample mapping you provided, I should be able to provide the base64 contents of the image file as the "contents" field, and the OCR text as the "text" field. So, when the OCR text is searched, I can return the "content" which is

Re: indexing binary

2014-02-27 Thread ZenMaster80
Thanks, it sounds like you are treating it as an attachment. In your example, what is the "fileContents" in .field("content", fileContents)? How do I get the file contents of an image? I know in the case of the PDF, this is the content text of the PDF. Correct, I don't want to index the image binary,

indexing binary

2014-02-26 Thread ZenMaster80
I index PDFs using Apache with the following mapping: .field("type", "attachment") .field("fields") .startObject() .startObject("file") .field("store", "yes") .endObject() I want to index photos, and I am able to extract text using OCR. I am confused about how to index the text, though. Do I treat

Re: Indexing Images

2014-02-20 Thread ZenMaster80
11:38:31 AM UTC-5, ZenMaster80 wrote: > > I am a bit confused about this topic. I would like to index images > (PNG, JPEG, GIF...). My understanding is that I need to extract and index > text portions from images; I don't really care about the metadata. So, I > looked onlin

Indexing Images

2014-02-20 Thread ZenMaster80
I am a bit confused about this topic. I would like to index images (PNG, JPEG, GIF...). My understanding is that I need to extract and index text portions from images; I don't really care about the metadata. So, I looked online and decided to use Apache Tika, which I also use to extract text and

Re: TransportSerializationException: Failed to deserialize exception response from stream

2014-02-20 Thread ZenMaster80
I ran into the same problem: the version was correct and the plugins were installed. In my case, port 9300 was not open for the TransportClient; once I opened it, it worked fine. On Thursday, February 20, 2014 9:06:42 AM UTC-5, Tiago Rodrigues wrote: > > I get this error sometimes when I try to create an index. > > M

Re: Searching PDF

2014-02-07 Thread ZenMaster80
You are correct, my JSON mapping had a wrong entry. Thanks for the help! On Friday, February 7, 2014 6:10:50 PM UTC-5, Binh Ly wrote: > > It looks like that indexing code might not be correct. I just tried this > code and it works for me: > > try { > String fileContents = readConten

Re: Searching PDF

2014-02-07 Thread ZenMaster80
So, What's wrong with this? GET localhost:9200/_search { "fields": "file", "query": { "match_all": {} } } .. "hits": { "total": 1, "max_score": 1, "hits": [ { "_index": "docs", "_type": "pdf", "_id": "1", "_sc

Re: Elastic Search deleting some files while indexing?

2014-02-07 Thread ZenMaster80
> > BTW elasticsearch does not delete documents unless you set _ttl. > > -- > *David Pilato* | *Technical Advocate* | *Elasticsearch.com* > @dadoonet <https://twitter.com/dadoonet> | > @elasticsearchfr<https://twitter.com/elasticsearchfr> > > > Le 7 fé

Re: Elastic Search deleting some files while indexing?

2014-02-07 Thread ZenMaster80
first check my documents. > > BTW elasticsearch does not delete documents unless you set _ttl. > > -- > *David Pilato* | *Technical Advocate* | *Elasticsearch.com* > @dadoonet <https://twitter.com/dadoonet> | > @elasticsearchfr<https://twitter.com/elasticsearchfr>

Re: Elastic Search deleting some files while indexing?

2014-02-07 Thread ZenMaster80
asticsearch.com* > @dadoonet <https://twitter.com/dadoonet> | > @elasticsearchfr<https://twitter.com/elasticsearchfr> > > > On 7 February 2014 at 15:58:15, ZenMaster80 (sabda...@gmail.com) > wrote: > > I am indexing about 5000 documents. When indexing is done, I use

Elastic Search deleting some files while indexing?

2014-02-07 Thread ZenMaster80
I am indexing about 5000 documents. When indexing is done, I use the "HEAD" plugin; it says it indexed 4950 docs and deleted 50 files, and I also verified with curl that only 4950 were indexed. I couldn't see anything in the logs, but how/when/why does Elasticsearch decide to delete some of the docs?

searching while indexing

2014-02-06 Thread ZenMaster80
I am unclear on how searching works while indexing. Let's say I already have a document indexed (version 1), and I updated the document, so I will index it again (version 2). What happens when the user is searching while version 2 is being indexed? Will the user get results from version 1?

Re: Improving Bulk Indexing

2014-02-04 Thread ZenMaster80
but SAS-2 (6Gbit/sec) RAID-0 drives of ~1TB per server. > > Jörg > > > > On Tue, Feb 4, 2014 at 5:22 PM, ZenMaster80 > > wrote: > >> Jörg, >> >> Great, I learned a lot about the process from your responses. Could you >> elaborate more on your

Re: Improving Bulk Indexing

2014-02-04 Thread ZenMaster80
Jörg, Great, I learned a lot about the process from your responses. Could you elaborate more on your use case? Mine, I think, will be similar to yours, where processing/feeding is on one server and I will use the transport client; index nodes will be on EC2. So, when I do get to setting up EC2 nodes,

Re: Improving Bulk Indexing

2014-02-03 Thread ZenMaster80
against a maximum concurrency limit. If the cluster is fast, the > client receives responses almost instantly, and the client can decide if it > is more appropriate to increase bulk request size or concurrency. > > Does it make sense? > > Jörg > > > > > On Mon, Feb 3,

Re: Improving Bulk Indexing

2014-02-03 Thread ZenMaster80
Jörg, Just so I understand this: if I were to index 100 MB worth of data in total with chunk volumes of 5 MB each, this means I have to index 20 times. If I were to set the bulk size to 20 MB, I will have to index 5 times. This is a small data size, but picture that I have millions of documents. Are you sa
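
The arithmetic in the question above is a ceiling division of the total data volume by the bulk request volume. A minimal sketch, where `BulkRequestCount` and `requestsNeeded` are hypothetical names chosen for this illustration:

```java
// Worked arithmetic only: how many bulk requests a given data volume needs
// when every request carries at most bulkBytes of data.
class BulkRequestCount {
    // Ceiling division, so a partial final chunk still counts as one request.
    static long requestsNeeded(long totalBytes, long bulkBytes) {
        if (bulkBytes <= 0) {
            throw new IllegalArgumentException("bulkBytes must be positive");
        }
        return (totalBytes + bulkBytes - 1) / bulkBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println(requestsNeeded(100 * mb, 5 * mb));  // 20
        System.out.println(requestsNeeded(100 * mb, 20 * mb)); // 5
    }
}
```

This only counts requests; it says nothing about which setting is faster, which is the question the thread goes on to discuss.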

Re: Improving Bulk Indexing

2014-02-03 Thread ZenMaster80
Jörg, Thanks for the tips. I meant 64 MB for the chunk volume, not the heap size (sorry). I thought that was normal, as I was thinking bigger chunks and fewer index transactions vs. smaller chunks and more index transactions. Basically, I was thinking that if I index smaller chunks it's going to take a lot long

Improving Bulk Indexing

2014-02-01 Thread ZenMaster80
I would appreciate some tips and others' perspectives on bulk indexing, since I am new to this. The end goal is to index 10 to 20 million documents. So, I started working on my local machine with a sample of about 100 MB worth and used whatever the default Elasticsearch configuration i

Re: Loading JSON to ElasticSearch

2014-01-28 Thread ZenMaster80
* > @dadoonet <https://twitter.com/dadoonet> | > @elasticsearchfr<https://twitter.com/elasticsearchfr> > > > On 28 January 2014 at 16:46:06, ZenMaster80 (sabda...@gmail.com) > wrote: > > I would like to get your perspective on how to load JSON to the index server >

Loading JSON to ElasticSearch

2014-01-28 Thread ZenMaster80
I would like to get your perspective on how to load JSON to the index server in my scenario. We have about 15 million documents in html/pdf/... on Server 1. I would like to process the data and convert it to JSON on Server 2. I would like the indexer to index the JSON on a separate machine/server, Server 3. Idea

Re: TransportClient not connecting

2014-01-23 Thread ZenMaster80
en index them… > > > -- > *David Pilato* | *Technical Advocate* | *Elasticsearch.com* > @dadoonet <https://twitter.com/dadoonet> | > @elasticsearchfr<https://twitter.com/elasticsearchfr> > > > On 23 January 2014 at 19:02:12, ZenMaster80 (sabda...@gmail.com)

Re: TransportClient not connecting

2014-01-23 Thread ZenMaster80
OK, great. I already prepare the files using Java, so I thought it would be a great spot to index them in Java as well (slightly better performance than HTTP, I am guessing). I struggled to find decent examples of indexing files via HTTP. I wouldn't mind testing with it as well if you can point to som

Native client or REST

2014-01-23 Thread ZenMaster80
I thought I understood this, but maybe not. I hope someone can shed some light on this. I have to index tons of files and I would like to be able to query them from our web application written in JavaScript; all will be running on AWS EC2. Question: If I index the files using the native Java API, will

Re: TransportClient not connecting

2014-01-23 Thread ZenMaster80
Jörg, Thanks, I did understand "local" to be something different. I want to get something straight: for production with AWS integration, I would like to index tons of files using the native protocol. I should be able to query the ES server from a JavaScript application using the HTTP API, at least thi

Re: TransportClient not connecting

2014-01-22 Thread ZenMaster80
the exception right away, or some time after starting up your > client? > > Ross > > > On Thursday, 23 January 2014 10:34:40 UTC+11, ZenMaster80 wrote: >> >> Yes, I do have 0.90.9 across the board. >> I know 9300 is opened. >> I am not sure how to check if bot

Re: TransportClient not connecting

2014-01-22 Thread ZenMaster80
Yes, I do have 0.90.9 across the board. I know 9300 is open. I am not sure how to check if both are using the same JVM. es.yml is default: default cluster name, node name... I only have the default (1 node)... Do I need to specify unicast instead of the default, which I believe uses multicast? On Wed

Re: TransportClient not connecting

2014-01-22 Thread ZenMaster80
:59:00 PM UTC-5, David Pilato wrote: > > You don't use Maven for your project? > If not, don't forget to add all needed dependencies. > > -- > David ;-) > Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > > > On 22 Jan. 2014 at 20:23, ZenMaster80

Re: TransportClient not connecting

2014-01-22 Thread ZenMaster80
Brian, This is no different from what I have. I googled the problem, and I guess this may come from the fact that ES is using a different Java version. I have added the es 0.90.0.jar to Java from the ES installation folder. I have no clue what I am missing. On Wednesday, January 22, 2014 2:02:

Re: TransportClient not connecting

2014-01-22 Thread ZenMaster80
Is anyone using TransportClient from Java? On Wednesday, January 22, 2014 12:04:30 PM UTC-5, ZenMaster80 wrote: > > I can't seem to figure out this problem. Node from NodeBuilder works, but > if I use TransportClient like below, I get an exception. > //I am using all

TransportClient not connecting

2014-01-22 Thread ZenMaster80
I can't seem to figure out this problem. Node from NodeBuilder works, but if I use TransportClient like below, I get an exception. //I am using all default settings //elasticsearch-0.90.9 Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", "elasticsearch").build(); InetSo

Return specific field and highlights via Java API

2014-01-20 Thread ZenMaster80
I am having two issues using the Java API: 1. I am not able to return a specific field in my search query. It shows I have the right number of results, but displays null. 2. Highlights are not returned. Note: Assume indexing is fine, because I am able to get correct results if I comment out the line .AddFiel

Re: Indexing PDF and other binary formats

2014-01-16 Thread ZenMaster80
> > About versions, elasticsearch does not keep old versions around. If you > need that, you have to manage it yourself. > > HTH > > -- > David ;-) > Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > > On 16 Jan. 2014 at 20:42, ZenMaster80 > > wrote: > >

Indexing PDF and other binary formats

2014-01-16 Thread ZenMaster80
- Is there any literature on how to index PDF documents and binary formats like images? - Versioning question: If I update an already indexed document, I believe ES will update the version number. I am wondering if it keeps the previous document. What if I needed access to the previous document?

Re: How to query Elastic Search from my web app?

2014-01-15 Thread ZenMaster80
s-and-the-browser/ > > -- > David ;-) > Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs > > > On 16 Jan. 2014 at 06:18, ZenMaster80 > > wrote: > > I am not very clear on how to do this. I have the following scenario: > My data/docs are indexed using scala nati

How to query Elastic Search from my web app?

2014-01-15 Thread ZenMaster80
I am not very clear on how to do this. I have the following scenario: My data/docs are indexed using Scala native Java API. - I would like to use the REST HTTP API to access ES. What I would like to understand is how I can query the ES server from my web application written in JavaScript. Are there

Re: How to approach Indexing for a newbie?

2014-01-14 Thread ZenMaster80
Never mind, I just had to import more jars from /lib. On Tuesday, January 14, 2014 8:26:43 PM UTC-5, ZenMaster80 wrote: > > Thanks. I added the .jar as a dependency in a simple java project using > eclipse. > I get this error when I try to run the program, any clues? > > E

Re: How to approach Indexing for a newbie?

2014-01-14 Thread ZenMaster80
Thanks. I added the .jar as a dependency in a simple Java project using Eclipse. I get this error when I try to run the program. Any clues? Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/lucene/util/Version at org.elasticsearch.Version.(Version.java:42) at org.elasticse

Re: How to approach Indexing for a newbie?

2014-01-14 Thread ZenMaster80
Wow, this is exactly what I was looking for. I am a bit curious about #5. I am assuming there is a Java API to access ES; is there any link on how to get started using Java with ES? I would like to know how to import the ES framework/API into a Java project. Thanks again, this is a great clarification!

Re: How to approach Indexing for a newbie?

2014-01-14 Thread ZenMaster80
I will take a look at this in more detail. But is there a simple answer to this question: let's say I have a folder with 5 JSON documents locally, doc1...doc5. How do I go about indexing the folder/documents? On Tuesday, January 14, 2014 2:12:41 PM UTC-5, InquiringMind wrote: > > This is getting

Re: How to approach Indexing for a newbie?

2014-01-14 Thread ZenMaster80
I will take a look at this in more detail. But is there a simple answer to this question: let's say I have a folder with 5 JSON documents locally, doc1...doc5. How do I go about indexing the folder/documents? On Tuesday, January 14, 2014 2:12:41 PM UTC-5, InquiringMind wrote: > > This is getting

How to approach Indexing for a newbie?

2014-01-14 Thread ZenMaster80
I have a project that used an old search engine and I would like to move things to ElasticSearch. I have been doing some reading, and I wanted some perspective on how to approach the problem. - I have bundles (folders) of text/html/pdf/img documents; each folder has an average of 50-100 documents

Re: How to index an existing json file

2014-01-08 Thread ZenMaster80
Thank you for the binary flag tip. It is also in the documentation here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-bulk.html On Wednesday, January 8, 2014 4:19:35 AM UTC-5, Jörg Prante wrote: > > Use the binary data flag or curl does not work correctly. > > curl -
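
The bulk documentation linked above describes a newline-delimited body: one action line followed by one source line per document, each ending in a newline. That is also why newlines must survive when the file is sent with curl. The sketch below builds such a body in plain Java. `BulkBodyBuilder`, the index name `library`, the type name `book`, and the two sample documents are all hypothetical and not taken from the thread; the `_type` field assumes an Elasticsearch version from the era of this thread.

```java
import java.util.List;

// Builds a newline-delimited bulk body from documents that are already JSON strings.
// Assumption: each document is a single-line JSON object with no embedded newlines.
class BulkBodyBuilder {
    static String build(String index, String type, List<String> jsonDocs) {
        StringBuilder sb = new StringBuilder();
        int id = 1;
        for (String doc : jsonDocs) {
            // Action line: tells the bulk endpoint where the next document goes.
            sb.append("{\"index\":{\"_index\":\"").append(index)
              .append("\",\"_type\":\"").append(type)
              .append("\",\"_id\":\"").append(id++).append("\"}}\n");
            // Source line: the document itself, also newline-terminated.
            sb.append(doc).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String body = build("library", "book", List.of(
                "{\"name\":\"example title one\"}",
                "{\"name\":\"example title two\"}"));
        System.out.print(body);
    }
}
```

The resulting text could be written to a file and posted to the bulk endpoint. The final newline matters, since the bulk format expects the last line to be terminated as well.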

Re: How to index an existing json file

2014-01-08 Thread ZenMaster80
Thank you for the binary flag tip. It is also in the documentation here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-bulk.html On Tuesday, January 7, 2014 9:00:33 PM UTC-5, ZenMaster80 wrote: > > Hi, > > I am just starting with ElasticSearch, I would

Re: How to index an existing json file

2014-01-07 Thread ZenMaster80
l command, so in your example it should be > in the same directory in which you executed the command (current directory). > > -- > Ivan > > > On Tue, Jan 7, 2014 at 6:00 PM, ZenMaster80 > > wrote: > >> Hi, >> >> I am just starting with ElasticSe

How to index an existing json file

2014-01-07 Thread ZenMaster80
Hi, I am just starting with ElasticSearch. I would like to know how to index a simple JSON document "books.json" that has the following in it: Where do I place the document? I placed it in the root directory of Elasticsearch and in the /bin folder.. {“books”:[{“name”:”life in heaven”,”author”:”Mike S