ion 2
>
> }
>
> Does Spark run Action 1 & 2 in parallel (some kind of pass through the
> driver code, and then start the execution)?
>
> If not, then is using threads safe for independent actions/red's?
>
>
>
--
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://in.linkedin.com/in/rishiteshmishra
to be restarted somewhere else.
>
> As I understand, the direct-kafka streaming model just computes offsets
> and relays the work to a KafkaRDD. How is the execution locality compared
> to the receiver-based approach?
>
> thanks, Gerard.
>
--
Regards,
Rishitesh Mishra,
Sn
duceByKey(reduceF)
rdd3.foreach(r => println(r))
You can always convert the RDD obtained after the transformation and
reduce back to a DataFrame.
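The round trip described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the object name RddRoundTrip, the helper sumByKey, and the (key, value) column layout are hypothetical, and the SQLContext entry point matches the Spark 1.x releases current at the time of this thread (it is deprecated, but still present, in later versions).

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.{DataFrame, SQLContext}

object RddRoundTrip {
  // Hypothetical helper: assumes column 0 is a String key and column 1 an Int value.
  def sumByKey(sqlContext: SQLContext, df: DataFrame): DataFrame = {
    import sqlContext.implicits._ // brings toDF into scope for RDDs of tuples

    // Drop down to the RDD API, transform each Row, and reduce by key.
    val reduced = df.rdd
      .map(row => (row.getString(0), row.getInt(1)))
      .reduceByKey(_ + _)

    // Convert the reduced pair RDD back into a DataFrame.
    reduced.toDF("key", "total")
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption made only so the sketch runs standalone.
    val conf = new SparkConf().setAppName("rdd-round-trip").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Illustrative input data, not taken from the original thread.
    val df = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3))).toDF("key", "value")
    sumByKey(sqlContext, df).show()
    sc.stop()
  }
}
```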
Regards,
Rishitesh Mishra,
SnappyData . (http://www.snappydata.io/)
https://www.linkedin.com/profile/view?id=AAIAAAIFdkMB_v-nolCrFH6_pKf9oH6tZD8Qlgo&tr
gards,
> Satish Chandra
>
> On Wed, Sep 23, 2015 at 4:02 PM, Rishitesh Mishra <
> rishi80.mis...@gmail.com> wrote:
>
>> Which version of Spark are you using? I can get correct results using
>> JdbcRDD. In fact, there is a test suite precisely for this (JdbcRDDSuit
Which version of Spark are you using? I can get correct results using
JdbcRDD. In fact, there is a test suite precisely for this (JdbcRDDSuite).
I changed it according to your input and got correct results from this test
suite.
On Wed, Sep 23, 2015 at 11:00 AM, satish chandra j wrote:
> HI All,
>
in
> key in order to perform the join.
>
>
>
> On Sat, Sep 19, 2015 at 12:55 PM, Rishitesh Mishra <
> rishi80.mis...@gmail.com> wrote:
>
>> Hi Reynold,
>> Can you please elaborate on this. I thought RDD also opens only an
>> iterator. Does it get materialize
Hi Reynold,
Can you please elaborate on this. I thought RDD also opens only an
iterator. Does it get materialized for joins?
Rishi
On Saturday, September 19, 2015, Reynold Xin wrote:
> Yes for RDD -- both are materialized. No for DataFrame/SQL - one side
> streams.
>
>
> On Thu, Sep 17, 2015 at
Hi Jem,
A simple way to get this is to use MapPartitionedRDD. Please see the code
below. For this you need to know the partition numbers of the parent RDD that
you want to exclude. One drawback is that the new RDD will also invoke a
similar number of tasks as the parent RDD, as both RDDs have the same numbe
get assigned to worker node to read data from
> remote hadoop cluster? I am more interested to know how mapr NFS layer is
> accessed in parallel.
>
> -
> Swapnil
>
>
> On Thu, Aug 27, 2015 at 2:53 PM, Rishitesh Mishra <
> rishi80.mis...@gmail.com> wrote:
>
>>
Hi Swapnil,
Let me try to answer some of the questions. Answers inline. Hope it helps.
On Thursday, August 27, 2015, Swapnil Shinde
wrote:
> Hello
> I am new to spark world and started to explore recently in standalone
> mode. It would be great if I get clarifications on below doubts-
>
> 1. Dri
Hi Sateesh,
It would be interesting to know how you determined that the DStream runs on
a single core. Did you mean receivers?
Coming back to your question, could you not start the disk I/O in a separate
thread, so that the scheduler can go ahead and assign other tasks?
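As a rough illustration of that suggestion, the sketch below hands a blocking file write to a dedicated thread pool through a Scala Future, so the calling thread returns immediately. The names AsyncDiskWrite and writeAsync, the pool size, and the temporary file are all assumptions made for the example. One caveat to keep in mind: work that outlives the task that submitted it is not covered by Spark's task retry, so failures would have to be handled by the caller.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object AsyncDiskWrite {
  // Dedicated pool for blocking I/O; the size of 2 is an arbitrary choice.
  private val ioPool = Executors.newFixedThreadPool(2)
  private implicit val ioContext: ExecutionContext =
    ExecutionContext.fromExecutorService(ioPool)

  // Submit the write to the I/O pool and return at once with a Future.
  def writeAsync(path: Path, lines: Seq[String]): Future[Path] = Future {
    Files.write(path, lines.mkString("\n").getBytes(StandardCharsets.UTF_8))
  }

  // Let the JVM exit once pending writes have finished.
  def shutdown(): Unit = ioPool.shutdown()

  def main(args: Array[String]): Unit = {
    val target = Files.createTempFile("batch-", ".txt")
    val pending = writeAsync(target, Seq("record-1", "record-2"))

    // The caller is free to continue while the write is in flight.
    println("write submitted, continuing with other work")

    // Wait only at the point where the result is actually needed.
    val written = Await.result(pending, 30.seconds)
    println(s"finished writing to $written")
    shutdown()
  }
}
```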
On 21 Aug 2015 16:06, "Sateesh Ka
I am not sure if you can view all RDDs in a session. Tables are maintained
in a catalogue, hence it's easier. However, you can see the DAG
representation, which lists all the RDDs in a job, in the Spark UI.
On 20 Aug 2015 22:34, "Dhaval Patel" wrote:
> Apologies
>
> I accidentally included Sp