How's the encoding handling power of ES?

2014-01-09 Thread HongXuan Ji
Hi all, 

I am wondering how Elasticsearch deals with documents in different 
encodings, for example documents written in different languages. 
Could you point me to a tutorial about it? Do I need to manually specify 
the encoding of a document when posting it?

Best,

Ivan
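
For reference: the Elasticsearch REST API takes JSON request bodies, and JSON is exchanged as UTF-8. As far as I know there is no per-document encoding setting, so a document held in another encoding is normally decoded on the client and re-encoded as UTF-8 before posting. A minimal Python sketch of that client-side step follows. The helper name `to_utf8_json`, the field name `content`, and the sample encodings are assumptions for illustration, not part of Elasticsearch.

```python
import json


def to_utf8_json(raw: bytes, source_encoding: str, field: str = "content") -> bytes:
    """Decode bytes from a known source encoding and return a UTF-8 JSON body."""
    # May raise UnicodeDecodeError if the bytes are not valid in that encoding.
    text = raw.decode(source_encoding)
    # ensure_ascii=False keeps non-ASCII characters as-is instead of \uXXXX escapes.
    return json.dumps({field: text}, ensure_ascii=False).encode("utf-8")


# Hypothetical inputs: the same step handles a Latin-1 and a Big5 source.
latin1_body = to_utf8_json("café".encode("latin-1"), "latin-1")
big5_body = to_utf8_json("中文".encode("big5"), "big5")
```

The resulting bytes can be sent as the request body of an index call. The source encoding still has to be known or detected on the client side.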

-- 
You received this message because you are subscribed to the Google Groups 
elasticsearch group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/7d6f7334-cc7d-4a37-88c5-6237f0d29b05%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


Re: How's the encoding handling power of ES?

2014-01-09 Thread HongXuan Ji
Hi Jason,

Thanks for the reply. I read the post.

I am also wondering how the encoding process in ES works and which 
underlying encoding ES uses to store data.

Do you have any documentation about this?

Thanks!

Regards,

Ivan

On Thursday, January 9, 2014 at 10:08:26 PM UTC+8, Jason Wee wrote:

 There is an example of indexing and querying in this SO question:
 http://stackoverflow.com/questions/8734888/how-to-search-for-utf-8-special-characters-in-elasticsearch

 hth

 Jason




Re: How many metadata fields exist of MP3 file ?

2014-01-08 Thread HongXuan Ji
Hi David,

I got the ALBUM field only by using the Solr endpoint 
HOST/solr/update/extract?extractOnly=true.
So it seems the mapper attachment does not support extracting extra 
fields, right?

BTW, can you point me to a tutorial about fsriver? I am also curious about 
what the plugin is for. What is its purpose?

Best,

Ivan

On Wednesday, January 8, 2014 at 6:23:03 PM UTC+8, David Pilato wrote:

 I would recommend not using the mapper attachment and managing the 
 extraction on your side instead.
 For example, I removed the mapper attachment from the fsriver project to 
 have finer control (see https://github.com/dadoonet/fsriver/issues/38).

 BTW, I'm not aware of how you can get the ALBUM field using Tika. Any 
 pointers? It could be nice to add it to fsriver as well.
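
One way to "manage that on your side", sketched under assumptions: read the tag in the client and send the album as an ordinary JSON field. The Python sketch below parses the simple ID3v1 tag (the last 128 bytes of an MP3 file) using only the standard library. It is a stand-in technique, not Tika and not what fsriver does. Many files carry ID3v2 tags instead, which this does not read. The function name `read_id3v1` and the sample values are hypothetical.

```python
from typing import Dict, Optional


def read_id3v1(data: bytes) -> Optional[Dict[str, str]]:
    """Return title/artist/album/year from an ID3v1 tag, or None if absent."""
    # ID3v1 occupies the last 128 bytes of the file and starts with b"TAG".
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    tag = data[-128:]

    def text(start: int, end: int) -> str:
        # Fields are fixed-width and padded with NUL bytes or spaces.
        return tag[start:end].split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(3, 33),
        "artist": text(33, 63),
        "album": text(63, 93),
        "year": text(93, 97),
    }


def _field(value: str, width: int) -> bytes:
    """Pad a value to the fixed field width used by ID3v1."""
    return value.encode("latin-1").ljust(width, b"\x00")


# Build a fake file ending in a hand-made ID3v1 tag to exercise the parser.
fake_tag = (b"TAG" + _field("Song", 30) + _field("Band", 30)
            + _field("Record", 30) + _field("2014", 4)
            + _field("", 30) + b"\x00")
fake_mp3 = b"\xff\xfb" + b"\x00" * 64 + fake_tag
metadata = read_id3v1(fake_mp3)
```

The returned dictionary can be merged into the JSON document before indexing, so `album` becomes a normal searchable field.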

 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | 
 @elasticsearchfr https://twitter.com/elasticsearchfr


 On 8 January 2014 at 10:49:47, HongXuan Ji (hxu...@gmail.com) wrote:

 Thanks for the reply. 

 Besides the six standard fields, I also want to get the extra fields. 
 For example, in Solr we can extract the album field of an MP3 file.
 Is this also supported in Elasticsearch? I just tested it: I posted an 
 MP3 file to ES, but the indexed document contains only the six 
 fields.

 Ideas?

 Thanks a lot.

 On Wednesday, January 8, 2014 at 4:34:07 PM UTC+8, David Pilato wrote:

  Have a look at 
 https://github.com/elasticsearch/elasticsearch-mapper-attachments/blob/master/src/main/java/org/elasticsearch/index/mapper/attachment/AttachmentMapper.java#L376
  
  You will see that mapper attachment reads:
  
  Metadata.DATE
  Metadata.TITLE
  Metadata.AUTHOR
  Metadata.KEYWORDS
  Metadata.CONTENT_TYPE
  Metadata.CONTENT_LENGTH
  
  Does it help?
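
As far as I recall from the plugin README, the six Tika properties listed above surface as sub-fields of an attachment field (`date`, `title`, `author`, `keywords`, `content_type`, `content_length`). Below is a sketch of a mapping body that stores them so a search can return them. The type name `doc`, the field name `file`, and the choice to store every sub-field are assumptions; check the sub-field names against the plugin version you run.

```python
import json

# Sub-field names as recalled from the mapper-attachments README (assumption).
SUB_FIELDS = ["date", "title", "author", "keywords", "content_type", "content_length"]


def attachment_mapping(type_name: str = "doc", field: str = "file") -> dict:
    """Build a mapping body that stores each metadata sub-field."""
    fields = {name: {"store": "yes"} for name in SUB_FIELDS}
    return {
        type_name: {
            "properties": {
                field: {"type": "attachment", "fields": fields}
            }
        }
    }


mapping = attachment_mapping()
print(json.dumps(mapping, indent=2))
```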

  



Re: How many metadata fields exist of MP3 file ?

2014-01-08 Thread HongXuan Ji
OK, I will open the issue later.

About the river: 

The first line says: This river plugin helps to index documents from your 
local file system and using SSH. 

Does that mean that if I store a bunch of PDF files in a local directory, I 
can search those files by using the river plugin?

In fact, I only started studying Elasticsearch this week, and I am not very 
familiar with what the file system means here.
Thanks a lot.

Ivan
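
Conceptually, a file system river walks a directory, turns each file into a document, and indexes it so that it becomes searchable. The sketch below shows only the walking step in Python. It is not FSRiver's implementation, and the function name `crawl`, the document fields, and the `.pdf` filter are assumptions for illustration.

```python
import os
import tempfile
from typing import Iterator, Tuple


def crawl(root: str, extensions: Tuple[str, ...] = (".pdf",)) -> Iterator[dict]:
    """Yield one small document per matching file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith(extensions):
                path = os.path.join(dirpath, name)
                yield {"filename": name, "path": path, "size": os.path.getsize(path)}


# Exercise the crawler on a throwaway directory with two fake files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.pdf", "notes.txt"):
        with open(os.path.join(tmp, name), "wb") as handle:
            handle.write(b"dummy")
    found = list(crawl(tmp))
```

Each yielded dictionary would then be sent to the index, together with the extracted file content, by whatever client you use.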

On Wednesday, January 8, 2014 at 7:32:17 PM UTC+8, David Pilato wrote:

 Mapper attachment does not support extra field extraction. Maybe you 
 could open an issue there? 
 https://github.com/elasticsearch/elasticsearch-mapper-attachments 

 About FSRiver, I guess everything is described here: 
 https://github.com/dadoonet/fsriver#filesystem-river-for-elasticsearch
 Is there something you don't understand?


 -- 
 *David Pilato* | *Technical Advocate* | *Elasticsearch.com*
 @dadoonet https://twitter.com/dadoonet | 
 @elasticsearchfr https://twitter.com/elasticsearchfr



How many metadata fields exist of MP3 file ?

2014-01-07 Thread HongXuan Ji
Hi all,

I am wondering how many metadata fields are available for an MP3 file when I 
post it to Elasticsearch using the mapper-attachment plugin. 

In Solr we can get the field information through the endpoint 
SOLR_HOST/update/extract?extractOnly=true, 

but is there any way to get such information in Elasticsearch? Besides 
MP3 files, how about doc files? 

I know that Elasticsearch uses Tika to support these operations. Can you give 
me an example of fetching a specific field from a specific file format?

Regards,

Ivan 




Cannot search the field of the attachment type?

2014-01-06 Thread HongXuan Ji
Dear all, 

I am not able to query the fields of the attachment type. I followed the 
instructions 
in http://es-cn.medcl.net/tutorials/2011/07/18/attachment-type-in-action.html.

This is the search query I ran:

curl "http://localhost:9200/_search?pretty=true" -d '{
  "fields" : ["title"],
  "query" : {
    "query_string" : {
      "query" : "amplifier"
    }
  },
  "highlight" : {
    "fields" : {
      "file" : {}
    }
  }
}'

Result:

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}


My Elasticsearch is the latest version, elasticsearch-0.90.9, and the 
mapper-attachment plugin is version 1.9.0 
(https://github.com/elasticsearch/elasticsearch-mapper-attachments).

In fact, I only finished setting up the environment yesterday. I have no clue 
why the search cannot find anything.
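
One thing worth checking in a setup like this, offered as a hedged suggestion rather than a diagnosis: the attachment type expects the field value to be the file content encoded as Base64, and a search only matches once such a document has actually been indexed. The Python sketch below builds an index body and the query body from the post as well-formed JSON. The field name `file` follows the query above, while the helper names and the sample bytes are hypothetical.

```python
import base64
import json


def attachment_doc(content: bytes, field: str = "file") -> str:
    """Return a JSON index body with the content Base64-encoded."""
    encoded = base64.b64encode(content).decode("ascii")
    return json.dumps({field: encoded})


def search_body(term: str) -> str:
    """Return the query from the post as a well-formed JSON string."""
    return json.dumps({
        "fields": ["title"],
        "query": {"query_string": {"query": term}},
        "highlight": {"fields": {"file": {}}},
    })


# Hypothetical sample content standing in for a real PDF or DOC file.
doc = attachment_doc(b"An amplifier increases signal power.")
query = search_body("amplifier")
```

Generating the bodies with a JSON library also avoids shell quoting mistakes in hand-written curl commands.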


Any ideas?


Regards,


Ivan
