Sorry for the delayed response.
I am using version 1.3. I was able to change the field data circuit breaker limit; I changed it to 80%. This is a nice setting to know about.
But it doesn't work. Maybe heap size is my problem, but I have very limited heap space.
Thank you.
On Friday, September 5, 2014 2:19:25
Could you turn on debug?
See https://github.com/dadoonet/fsriver#debug-mode
Also, which versions are you using?
--
David ;-)
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
On 17 Sept 2014 at 01:49, Preeti Jain wrote:
Hi,
I'm using elasticsearch version 1.0.1 that is installed on l
Looks like a bug in Hive, which passes a null Progressable to the underlying
task. I would recommend upgrading Hive to 0.10 or later (and Hadoop to
1.2.1 while at it).
On Sep 16, 2014 8:27 PM, "ibmuser1" wrote:
> Hi, my hadoop version is 1.1.1 and hive version is 0.9.0 (biginsights
> installatio
Please upgrade to version 2.0.1
On 9/17/14 1:18 AM, Jinyuan Zhou wrote:
I have confirmed with both elasticsearch-hive and elasticsearch-mr: if both
of the situations below occur, EsOutputFormat
produces an invalid header for bulk indexing.
1. es.resource contains data to be extracted from the document
2.
Hi,
I'm using elasticsearch version 1.0.1 that is installed on linux machine.
I have created fs river to index content from file system. The river
definition is
curl -XPUT 'http://localhost:9200/_river/riverTest/_meta' -d '{
"type": "fs",
"fs": {
"url": " /data01/test/NewVehicleFiles1/",
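For comparison, a complete registration for a river of type fs generally takes this shape (a hedged sketch based on the fsriver project's conventions; the update_rate value is illustrative, and the path is the one from the snippet without the leading space):

```shell
# Register an fs river that polls the directory every 15 minutes (900000 ms)
curl -XPUT 'http://localhost:9200/_river/riverTest/_meta' -d '{
  "type": "fs",
  "fs": {
    "url": "/data01/test/NewVehicleFiles1/",
    "update_rate": 900000
  }
}'
```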
I have observed that Elasticsearch defaults the search thread pool to 3 ×
the number of CPUs, and even if you increase this to a fixed number it does
not really help, as the threads start sharing the CPU cycles.
Does this mean that to get the same performance results for more concurrent
searches
I either have to s
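For reference, in the 1.x line the search pool can be pinned in elasticsearch.yml (a sketch; the numbers are placeholders, not recommendations):

```yaml
# elasticsearch.yml -- fixed-size search thread pool (1.x settings syntax)
threadpool.search.type: fixed
threadpool.search.size: 24
threadpool.search.queue_size: 1000
```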
I have confirmed with both elasticsearch-hive and elasticsearch-mr: if both
of the situations below occur, EsOutputFormat produces an invalid header for
bulk indexing.
1. es.resource contains data to be extracted from the document
2. es.mapping.id is set to one of the fields in the document
I looked at the code
I have a query with a nested boolean (boolean within a boolean) filter with
a should clause that performs really terribly. But if I move the nested
query up to top level, it performs as much as 50x faster. I am struggling
to understand why this is the case. Here are the 2 forms:
https://gist.g
I have logstash indices that go back thirty days. I have logs in those
indices from today.
If I do a search with:
"size": 500,
"sort": [
  {
    "@timestamp": {
      "order": "desc",
      "ignore_unmapped": true
    }
  }
]
I don't get any logs from today. If I limit the s
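For context, the snippet above fits into a full request body shaped like this (a sketch; match_all stands in for whatever query was actually used):

```json
{
  "size": 500,
  "query": { "match_all": {} },
  "sort": [
    { "@timestamp": { "order": "desc", "ignore_unmapped": true } }
  ]
}
```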
Hi Matt,
Thanks for your quick response. However, neither worked for us. In our case,
we set shard_size to 50K (option 1), and it is still missing documents. The
cluster became unstable if we try to further increase it. We cannot use
the shard_min_doc_count value, because even if it is one hit, its value u
Hi Yifan,
Nothing dynamic, but you can increase the number of terms collected on each
shard to increase the accuracy [1]. Might also want to play with the
shard_min_doc_count value if you know certain shards have a low hit count
and are throwing off the aggregations [2].
[1]
http://www.elasticse
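For reference, both knobs sit directly on the terms aggregation (a sketch; the field name and numbers are placeholders):

```json
{
  "aggs": {
    "top_categories": {
      "terms": {
        "field": "category",
        "size": 10,
        "shard_size": 100,
        "shard_min_doc_count": 1
      }
    }
  }
}
```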
Thanks Mark. Thought this was that list. :-)
On Tuesday, September 16, 2014 3:08:45 PM UTC-5, Mark Walkom wrote:
>
> You should ask this over on the logstash list -
> https://groups.google.com/forum/?hl=en-GB#!forum/logstash-users :)
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Camp
You should ask this over on the logstash list -
https://groups.google.com/forum/?hl=en-GB#!forum/logstash-users :)
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 17 September 2014 06:04, Marty Hillman wrote:
> I ev
I even bought the book and rebuilt my test environment servers from
scratch, but I still have the same issues.
On the central server, I have redis, logstash 1.4 and elasticsearch 1.3
installed - all from apt repositories. I verified that all services are
started and I can curl results from the
By default, ES uses a discovery method that allows any node with a given
cluster name to join an existing node with the same cluster name, thereby
forming one cluster.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html
and you want to look at unicast di
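A unicast setup along those lines might look like this in elasticsearch.yml (a sketch; the hostnames are placeholders):

```yaml
# Disable multicast and list seed nodes explicitly (zen discovery, 1.x)
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["es-node1.example.com", "es-node2.example.com"]
```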
It seems to be a common problem that the top N results returned from an
aggregation query are inaccurate due to the uneven distribution of matching
documents on different shards, because ES will collect the top N buckets from
each shard no matter how many hits are actually on each shard. It is very
ofte
I'm trying to track down an issue where 2 simple documents I'm testing are
being ranked quite a bit differently.
For testing purposes, I'm only searching against one field, "keywords". The
only word in that field for both documents is "jefferson". However, when I
search for the word "jefferson"
Ah, I just found the n/2+1 recommendation, so I expect I need to set it to
3.
On Tuesday, September 16, 2014 11:30:38 AM UTC-7, Tim Heikell wrote:
>
> Thanks for the reply Jörg. I have discovery.zen.minimum_master_nodes=2.
> Should it be something different?
>
> On Tuesday, September 16, 2014 11
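For a cluster where all 4 nodes are master-eligible, the quorum recommendation works out to floor(4/2) + 1 = 3, which would be set like this (a sketch; assumes all 4 nodes can become master):

```yaml
# elasticsearch.yml -- require a quorum of master-eligible nodes for election
discovery.zen.minimum_master_nodes: 3
```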
Thanks for the reply Jörg. I have discovery.zen.minimum_master_nodes=2.
Should it be something different?
On Tuesday, September 16, 2014 11:21:16 AM UTC-7, Jörg Prante wrote:
>
> It looks like you did not configure minimum_master_nodes
>
> Jörg
>
> On Tue, Sep 16, 2014 at 8:00 PM, Tim Heikell >
It looks like you did not configure minimum_master_nodes
Jörg
On Tue, Sep 16, 2014 at 8:00 PM, Tim Heikell
wrote:
> We are prepping to launch our app into production and seem to be having
> some stability issues. We have a cluster of 4 VMs on Azure that all use the
> Azure plugin for discovery.
We are prepping to launch our app into production and seem to be having
some stability issues. We have a cluster of 4 VMs on Azure that all use the
Azure plugin for discovery. Most of the time it works as expected, but
sometimes it loses its mind. This morning, for example, I made adjustments
t
Thank you! I got that working.
On Tuesday, September 16, 2014 7:25:39 AM UTC-7, pawansharma2045 wrote:
>
> So you need to restart that node.
>
> On Tue, Sep 16, 2014 at 12:46 AM, shriyansh jain > wrote:
>
>> Hi,
>>
>> I am getting the following error in the elasticsearch log file. I have a
>> c
I am working with analytics of events. I use Hadoop to process the logs and
store some results in MySQL. This does not work now due to scalability
issues, as logs keep coming daily.
We need to show stats per year, month, week, day, and hour, along with
filtering capability.
Our samples can grow for
Hi, my hadoop version is 1.1.1 and hive version is 0.9.0 (biginsights
installation). I am trying to push data from existing hive table(s) into
elasticsearch. My job fails with the following error. I copied hive
script as well below the error. Not sure what I am doing wrong. Can you
help?
Oh. Sorry :-)
On Mon, Sep 15, 2014 at 3:27 AM, Mark Walkom
wrote:
> You probably want to put this in your own thread :)
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: ma...@campaignmonitor.com
> web: www.campaignmonitor.com
>
> On 15 September 2014 06:55, SAUR
Hello,
I have the exact same issue. I wonder how to get full strings instead of
their stems, which is not what I expect from a "suggest" query. I don't have
any solution yet.
A solution was found here
http://stackoverflow.com/questions/22071198/adding-mapping-to-a-type-from-java-how-do-i-do-it
On Mon, Sep 15, 2014 at 4:16 PM, Jack Park wrote:
> I got this on 1.2.2 and found on the web that it was a bug. So, I upgraded
> to 1.3.2 and got the same bug.
>
> There was a
Hadn't looked at Jackson for a while, but it seems to do both XML and CSV
(limited to JSON that represents tabular data).
On Tuesday, 16 September 2014 10:48:58 UTC-4, John Smith wrote:
>
> Yep, already doing that part actually...
>
> Was just wondering I guess the best way to deserialize from json
Hi guys,
I have objects with 3 fields of type array containing a large amount of
integers.
These integers are mutually exclusive between fields: if an integer is in
field1, it can't be in field2 or field3, and vice versa.
For instance:
object_1: {
field1: [1,4,5,8],
field2: [2,6,7
When it comes to JSON, Jackson should be at the top of your list. It's an excellent library and it has plenty of support
for XML [1]
[1] https://github.com/FasterXML/jackson-dataformat-xml
On 9/16/14 5:48 PM, John Smith wrote:
Yep, already doing that part actually...
Was just wondering I gue
Yep, already doing that part actually...
Was just wondering, I guess, the best way to deserialize from JSON to XML,
for instance.
I suppose it's slightly off topic, but what are some good JSON to XML
converters?
On Tuesday, 16 September 2014 10:23:05 UTC-4, David Pilato wrote:
>
> You need to use
Hi,
(let me know if this is not the right place to post ElasticSearch.Net
questions).
I'm indexing a document of type "User" through ElasticSearch.Net with this
command (key is a string guid):
client.Index(index, "user", key, user);
This invokes the serializer and stores the json in my ES clu
So you need to restart that node.
On Tue, Sep 16, 2014 at 12:46 AM, shriyansh jain
wrote:
> Hi,
>
> I am getting the following error in the elasticsearch log file. I have a
> cluster of 2 elasticsearch nodes, and have a setup of ELK stack with redis
> as a buffer. Everything was running fine but
Sorry for bumping this, but I'm a little stumped here.
We have some nodes that are evicting fielddata cache entries for seemingly
no reason:
1) we've set indices.fielddata.cache.size to 10gb
2) the metrics from the node stats endpoint show that the
indices.fielddata.memory_size_in_bytes never ex
You need to use the scan and scroll API for that.
See
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-request-scroll.html#scroll-scan
This class could help you in Java:
https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/search
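The REST flow behind scan and scroll looks roughly like this (a hedged sketch for 1.x; the index name and sizes are placeholders, and the scroll_id must be copied out of each response):

```shell
# 1) Open a scan: the first response returns a scroll_id but no hits (1.x)
curl -XGET 'localhost:9200/myindex/_search?search_type=scan&scroll=1m' -d '{
  "query": { "match_all": {} },
  "size": 100
}'

# 2) Replay the scroll_id repeatedly until a response comes back with no hits
curl -XGET 'localhost:9200/_search/scroll?scroll=1m' -d 'SCROLL_ID_FROM_PREVIOUS_RESPONSE'
```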
We are also facing this kind of issue in ES version 1.1.1.
Some node gets disconnected, and while analyzing the logs on that
disconnected node we got a lot of connection timeout errors. Sometimes
this issue gets solved by restarting the master node, but sometimes we may
need to restart the whole c
On GitHub under issues [1] or in the release notes for the 2.0.1 release. Most
likely, you are facing issue #210.
[1]
https://github.com/elasticsearch/elasticsearch-hadoop/issues?q=is%3Aissue+label%3Av2.0.1+is%3Aclosed
On 9/16/14 4:52 PM, Joe,Yu wrote:
On Tue, Sep 16, 2014 at 7:01 PM, Costin
Also it has to be done on the back end so JAVA it is...
On Tuesday, 16 September 2014 10:04:44 UTC-4, John Smith wrote:
>
> Hi, building some sort of internal tool to export data from Elasticsearch
> and I would like to offer CSV or XML.
>
> Just wondering what options there are...
>
>
> Bassical
Hi, building some sort of internal tool to export data from Elasticsearch,
and I would like to offer CSV or XML.
Just wondering what options there are...
Basically, a user can log in to a front end (no, I cannot use what is out
there; it's only a small portion of a larger tool within the organiza
Hi,
I'm a newbie with Elasticsearch. I'm validating Elasticsearch regarding our
needs.
Let's say I want to monitor the disk usage of my VMs.
- vm1 and vm2 are in Platform PF_A, vm3 is in platform PF_B
The mapping I declared (can be pasted in sense)
PUT /example_201408/vm/_mapping
{
"_timestamp
On Tue, Sep 16, 2014 at 7:01 PM, Costin Leau wrote:
> Hi,
>
> Upgrade to es-hadoop 2.0.1.
> The error is caused by the fact that you have nodes within the ES cluster
> without a HTTP/REST point. These are now properly excluded though note, it
> means they will not be used by es-hadoop.
> As an a
So I have 1 ELK server set up and working just fine; its IP is 172.16.40.28. We
wanted to build a second one to log different servers and for several
reasons keep the data separate. So I built the new server and set up ELK
again; all seems fine. The IP of the new server is 172.16.40.29. When I go
to t
I have the following scenario:
SHOP1 sells: apple laptop, apple ipad, apple phone
SHOP2 sells: apple laptop
SHOP3 sells: HP laptop
I want to generate keywords for what each shop sells, such that "apple ipad"
or "ipad apple" should show only SHOP1, not SHOP2.
How can I generate searchable keyword by
Hi List,
From the looks of it, everything is possible, but I still have some
questions. My application consists of events being upserted that expire
after 30 seconds, and doing aggregations on those. I always filter on
user_id, which is also the routing value.
event_fields =
{"user_id","timestamp
Hello!
I'm trying to create a query that would return the last (sorted by
timestamp) 10 hits. I'm using logstash to parse and index my log files...
I tried 2 different queries:
{
"query" : {
"filtered" : {
"query": {"match" : {"user" : "abc"}},
"query":
I have two types stored in an index: locations and activities. An activity
has a 'relation' to a location, i.e. an activity takes place at a location.
Is it possible to get a location search result set that includes the count
of activities at each location? Sort of like annotating each locatio
You need to wrap the has_parent query in the query part of the filtered
query:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-filtered-query.html#query-dsl-filtered-query
I don't see how this query could have worked in 0.90.5, since the format is
incorrect, but if
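Concretely, the wrapped form has this shape (a sketch; the parent type, fields, and values are placeholders):

```json
{
  "query": {
    "filtered": {
      "query": {
        "has_parent": {
          "parent_type": "blog",
          "query": { "term": { "tag": "elasticsearch" } }
        }
      },
      "filter": { "term": { "status": "published" } }
    }
  }
}
```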
Any suggestions?
On Monday, 15 September 2014 23:31:26 UTC+5:30, Roopendra Vishwakarma wrote:
>
> In Elasticsearch 1.3.0, *filtered* is not working with *has_parent*. In
> *Elasticsearch
> 0.90.5* it works fine.
>
> I am using the query below. In this query I need to add filtered inside
> *has_parent-
Hi,
Upgrade to es-hadoop 2.0.1.
The error is caused by the fact that you have nodes within the ES cluster without an HTTP/REST endpoint. These are now
properly excluded; note, though, that this means they will not be used by es-hadoop.
As an alternative, consider enabling HTTP on all your data nodes.
On 9
Hello list,
I have a 4-node ES cluster and a 6-node CDH cluster running in the lab.
The Hive job is as below:
hive job===
CREATE TABLE logs (type STRING, time STRING, ext STRING, ip STRING, req
STRING, res INT, bytes INT, phpmem INT, agent STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED B
Can you manually test all of that using telnet?
Regards,
Mark Walkom
Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com
On 16 September 2014 20:09, Abhishek Aggarwal wrote:
> Thanks for the reply. I am facing this error intermittently. Transp
Thanks for the reply. I am facing this error intermittently. Transport
Client works fine sometimes - so it rules out firewall or port related
issues.
I have only one ES node (version 1.1.1)
- Firewall is not configured
- TCP and UDP on port 9300 are open
- sniff is disabled (I'm using default t
Maybe you use a network filter / firewall which is misconfigured - no
connection is possible, everything seems to time out.
You must open TCP and UDP on port 9300 on all the hosts of the cluster
nodes if you use TransportClient.
Also check if your network can operate regarding other nodes, if you
I am connecting to a single instance of an Elasticsearch server remotely via
the Transport client.
In my web application, which makes use of the Transport client, I am seeing
the following messages in the logs:
I have checked that my network connection is proper and the ES server is up.
But I am still getting these messages i
If you want to use the filter parser plugin - I think you mean
https://github.com/lmenezes/elasticsearch-terms-fetch-filter-plugin - then
why don't you simply extend the plugin and build a new plugin from that
codebase?
From what I understand, you somehow want to modify the search action core
c
Hello,
I've picked up a great little utility called wirbelsturm
(https://github.com/miguno/wirbelsturm). With it I've managed to automate
the creation of Vagrant backed VMs for a large chunk of my infrastructure
without much pain. Then I got to elasticsearch. I've tried a few variations
of the
Just saw that the query profiler cannot show what the shard execution
times are, so maybe this is not a big help.
Jörg
On Tue, Sep 16, 2014 at 9:24 AM, joergpra...@gmail.com <
joergpra...@gmail.com> wrote:
> If you are sure the spikes are caused by the JVM, I recommend to attach a
> profiler to
If you are sure the spikes are caused by the JVM, I recommend to attach a
profiler to the JVM, then you can monitor the code.
On JVM level, it is hard to trace queries, so maybe you want to test out
bleeding edge? Here is a query profiler:
https://github.com/elasticsearch/elasticsearch/pull/6699
You cannot bind the same port to 2 IPs.
This should work:
network.host: 192.168.1.213
See details at
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-network.html#modules-network
HTH
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticse
You could just check for the response code 500, and you're done, no need to
capture streams.
Jörg
On Tue, Sep 16, 2014 at 12:53 AM, Alex Roytman wrote:
> I guess I could but it would mean passing a response wrapper to capture
> output stream and then copy it to real request or discard it in cas
Hi,
I have 2 nodes with each 2 network interfaces.
One of the networks is public and the other is private.
I want to use elasticsearch only on the private network and for convenience
also on the loopback devices.
I have tried multiple ways in the yml file:
network.bind_host: [ "192.168.1.213" ,