Hi,
I've been following this thread for a while.
I'm trying to introduce a testing strategy in my team for a number of data
pipelines before they go to production. I have watched Lars' presentation and
found it great. However, I'm debating whether unit tests are worth the effort
if there are good job-level
> Thanks,
>
> Muthu
>
> On Wed, Mar 15, 2017 at 10:55 AM, vvshvv <vvs...@gmail.com> wrote:
>
> Hi Muthu,
>
> I agree with Shiva, Cassandra also supports SASI indexes, which can
> partially replace Elasticsearch functionality.
>> Regards,
>> Uladzimir
>>
>>
>>
>> Sent from my Mi phone
>> On Shiva Ramagopal <tr.s...@gmail.com>, Mar 15, 2017 5:57 PM wrote:
Probably Cassandra is a good choice if you are mainly looking for a
datastore that supports fast writes. You can ingest the data into a table
and define one or more materialized views on top of it to support your
queries. Since you mention that your queries are going to be simple, you can
define
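As a sketch of the pattern described above (table, view, and column names are
hypothetical; syntax assumes Cassandra 3.x, where materialized views and SASI
indexes are available):

```sql
-- Hypothetical write-optimized base table
CREATE TABLE events (
    device_id uuid,
    ts timestamp,
    payload text,
    PRIMARY KEY (device_id, ts)
);

-- Materialized view serving a different (simple) query pattern
CREATE MATERIALIZED VIEW events_by_ts AS
    SELECT * FROM events
    WHERE ts IS NOT NULL AND device_id IS NOT NULL
    PRIMARY KEY (ts, device_id);

-- SASI index for text matching on a non-key column, the feature
-- mentioned earlier in this thread as a partial ES replacement
CREATE CUSTOM INDEX events_payload_idx ON events (payload)
USING 'org.apache.cassandra.index.sasi.SASIIndex';
```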
Probably using a queue like RabbitMQ between Spark and ES could help, by
buffering the Spark output when ES can't keep up.
Some links:
1. ES-RabbitMQ River -
https://github.com/elastic/elasticsearch-river-rabbitmq/blob/master/README.md
2. Using RabbitMQ with ELK -
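The buffering idea above is essentially a bounded queue between a fast
producer (the Spark output) and a slow consumer (the ES bulk writer). A
minimal, self-contained sketch in plain Java, using an in-memory
BlockingQueue as a stand-in for RabbitMQ (all names illustrative):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class BufferDemo {
    public static void main(String[] args) throws InterruptedException {
        // Bounded queue: put() blocks the fast producer whenever the
        // slow consumer (think: Elasticsearch indexing) falls behind.
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(100);

        Thread producer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                try {
                    buffer.put("record-" + i); // blocks when the buffer is full
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });

        List<String> indexed = new ArrayList<>();
        Thread consumer = new Thread(() -> {
            while (indexed.size() < 1000) {
                try {
                    indexed.add(buffer.take()); // drain at the consumer's pace
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        });

        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        System.out.println(indexed.size());
    }
}
```

A real deployment would replace the in-memory queue with the RabbitMQ client,
but the backpressure behaviour is the same: the producer blocks instead of
overwhelming the consumer.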
+1 for the Java love :-)
On 30-Jul-2016 4:39 AM, "Renato Perini" wrote:
> Not only very useful, but finally some Java love :-)
>
> Thank you.
>
> On 29/07/2016 22:30, Jean Georges Perrin wrote:
>
>> Sorry if this looks like shameless self-promotion, but some of
Hi Lars,
Very pragmatic ideas around testing of Spark applications end-to-end!
-Shiva
On Fri, Mar 18, 2016 at 12:35 PM, Lars Albertsson wrote:
> I would recommend against writing unit tests for Spark programs, and
> instead focus on integration tests of jobs or pipelines of
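The job-level approach recommended above can be illustrated without any Spark
machinery: run the whole transformation on a small known input and compare
the complete output. A self-contained sketch in plain Java, where `runJob` is
a stand-in for a real pipeline stage (all names are illustrative):

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

public class JobLevelTestSketch {
    // Stand-in for an entire pipeline stage: normalize, filter, sort.
    static List<String> runJob(List<String> input) {
        return input.stream()
                .map(s -> s.trim().toLowerCase(Locale.ROOT))
                .filter(s -> !s.isEmpty())
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Job-level test: known input in, full expected output compared,
        // rather than asserting on individual helper functions.
        List<String> input = List.of("  Foo", "BAR", "   ", "foo");
        List<String> expected = List.of("bar", "foo", "foo");
        List<String> actual = runJob(input);
        if (!actual.equals(expected)) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
        System.out.println("job-level test passed");
    }
}
```

The test exercises the job end-to-end, so internal refactoring does not break
it as long as the input/output contract holds.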
How are you submitting/running the job - via spark-submit or as a plain old
Java program?
If you are using spark-submit, you can control the memory setting via the
configuration parameter spark.executor.memory in spark-defaults.conf.
If you are running it as a Java program, use -Xmx to set the
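The two settings mentioned above, as a sketch (class and jar names are
hypothetical):

```shell
# Via spark-submit: executor memory from configuration
spark-submit --conf spark.executor.memory=4g \
  --class com.example.MyJob myapp.jar

# As a plain Java program: cap the maximum heap directly
java -Xmx4g -cp myapp.jar com.example.MyJob
```

The same `spark.executor.memory 4g` line can instead live in
spark-defaults.conf so every submission picks it up.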