Re: Joining streaming data with static table data.

2017-12-11 Thread Rishi Mishra
You can join a streaming dataset with a static dataset. I would prefer your first approach, but the problem with it is performance: unless you cache the static dataset, every time you fire a join query it will fetch the latest records from the table. Regards, Rishitesh Mishra,
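A minimal sketch of the caching trade-off described above, assuming PySpark-style DataFrames. The helper name `join_stream_with_static` and its `cache_static` flag are hypothetical; only `DataFrame.cache()` and `DataFrame.join(other, on, how)` are real Spark calls. Caching avoids re-reading the table on every query, at the cost that rows added to the table later may not be visible until the cache is dropped.

```python
def join_stream_with_static(stream_df, static_df, key, cache_static=True):
    """Join a streaming DataFrame with a static one on `key`.

    Hypothetical helper, not part of Spark. With cache_static=True the
    static side is marked for caching so it can be reused across join
    queries instead of being fetched from the table each time. The cached
    rows may then be stale relative to the underlying table.
    """
    static_side = static_df.cache() if cache_static else static_df
    return stream_df.join(static_side, on=key, how="inner")


# Possible usage, assuming an existing SparkSession named `spark` and a
# hypothetical table `lookup` that has a `value` column:
#
#   stream_df = spark.readStream.format("rate").load()  # columns: timestamp, value
#   static_df = spark.read.table("lookup")
#   joined = join_stream_with_static(stream_df, static_df, "value")
```

Passing `cache_static=False` keeps the behaviour mentioned in the reply, where each join query reads the table again and therefore sees the latest records.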

Spark 2.0.1 fails for provided hadoop

2016-08-30 Thread Rishi Mishra
Hi All, I tried to configure Spark with a MapR Hadoop cluster. For that, I built Spark 2.0 from source with the hadoop-provided option. Then, as per the documentation, I set my Hadoop libraries in spark-env.sh. However, I get an error while the SessionCatalog is being created. Please see below for the exception

StatefulNetworkWordCount behaviour

2016-03-22 Thread Rishi Mishra
I am trying out StatefulNetworkWordCount from the latest Spark master branch. When I run this example I see an odd behaviour. If a key is repeated within a batch, the output stream prints a line for each repetition. For example, if I key in "ab" five times, it shows (ab,1) (ab,2) (ab,3) (ab,4) (ab,5)
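The output above is what one would expect if the state update function is invoked once per input record rather than once per key, as is the case with Spark Streaming's `mapWithState`. The following is a plain-Python simulation under that assumption, not Spark code; the function names `map_with_state_batch` and `pre_aggregate` are hypothetical.

```python
from collections import Counter


def map_with_state_batch(batch, state):
    """Apply a running-count update to every record in a batch.

    One output is emitted per input record, so a key that appears
    several times in the batch produces several outputs.
    """
    emitted = []
    for word, count in batch:
        total = state.get(word, 0) + count
        state[word] = total
        emitted.append((word, total))
    return emitted


def pre_aggregate(batch):
    """Sum counts per key first, so each key appears once in the batch."""
    counts = Counter()
    for word, count in batch:
        counts[word] += count
    return list(counts.items())


batch = [("ab", 1)] * 5

# Per-record updates: one line per repetition of the key.
print(map_with_state_batch(batch, {}))
# [('ab', 1), ('ab', 2), ('ab', 3), ('ab', 4), ('ab', 5)]

# Pre-aggregated batch: a single line per key.
print(map_with_state_batch(pre_aggregate(batch), {}))
# [('ab', 5)]
```

In this sketch, summing counts per key before the stateful step yields one output per key per batch.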

Re: HashedRelation Memory Pressure on Broadcast Joins

2016-03-03 Thread Rishi Mishra
Hi Davies, When you say *"UnsafeRow could come from UnsafeProjection, so We should copy the rows for safety."* do you mean that the underlying state might change because of some state update APIs? Or is it due to some other rationale? Regards, Rishitesh Mishra, SnappyData.