on this. This is in very early stages, somewhat hacky, and would probably
require more testing.
Regards,
Sandeep Giri,
www.CloudxLab.com <http://www.cloudxlab.com/>
Hi,
Good Day.
Could you please let me know whether we can see the Spark logical or physical
plan while running a Spark job on a YARN cluster (e.g., the number of
stages)?
Thanks in advance.
Thanks,
Giri
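For reference, a minimal sketch of how the plans can be inspected in a Spark 1.x spark-shell (the table name here is illustrative, not from the thread):

```scala
// Sketch (Spark 1.x spark-shell; "some_table" is an illustrative name):
val df = sqlContext.sql("SELECT * FROM some_table")
df.explain(true)              // prints both the logical and the physical plan

// For a plain RDD, the lineage (and hence the stage boundaries) is shown by:
println(df.rdd.toDebugString)
```

The same stage breakdown also appears in the Spark web UI, which on YARN is reachable through the ApplicationMaster link in the ResourceManager UI.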
I'm not using docker
On Wed, May 11, 2016 at 8:47 AM, Raghavendra Pandey <
raghavendra.pan...@gmail.com> wrote:
> By any chance, are you using docker to execute?
> On 11 May 2016 21:16, "Raghavendra Pandey"
> wrote:
>
>> On 11 May 2016 02:13, "gpatcham"
method1 looks like this:

reRDD.map(row => method1(row, sc)).saveAsTextFile(outputDir)

reRDD has userIds.

def method1(userId: String, sc: SparkContext): String = {
  sc.cassandraTable("Keyspace", "Table2").where("userid = ?", userId)
  // ...do something
  "Test"
}
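Note that this pattern fails as written because SparkContext is not serializable and cannot be used inside map. A hedged alternative sketch, assuming a connector version that provides joinWithCassandraTable and that userid is the partition key of Table2 (keyspace/table names are illustrative):

```scala
import com.datastax.spark.connector._

// Join the userId RDD against the Cassandra table on the executors,
// instead of calling sc.cassandraTable from inside a map.
val results = reRDD
  .map(userId => Tuple1(userId))              // key must match the partition key
  .joinWithCassandraTable("keyspace", "table2")
// results is an RDD of (Tuple1(userId), CassandraRow) pairs to "do something" with.
```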
On Wed, Jan 20, 2016 at 11:00 AM,
I'm using the Spark Cassandra connector to do this, and the way we access a
Cassandra table is:
sc.cassandraTable("keySpace", "tableName")
Thanks
Giri
On Mon, Jan 18, 2016 at 12:37 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Can you pass the properties which are needed for
Can we use @transient ?
On Mon, Jan 18, 2016 at 12:44 PM, Giri P <gpatc...@gmail.com> wrote:
that would work.
>
> Doesn't seem to be good practice.
>
> On Mon, Jan 18, 2016 at 1:27 PM, Giri P <gpatc...@gmail.com> wrote:
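For context, a minimal sketch of what marking a field @transient looks like (class and field names are illustrative; this assumes the goal is simply to keep a non-serializable reference out of the serialized object):

```scala
// @transient tells serialization to skip this field, so the non-serializable
// SparkContext is not shipped to executors. After deserialization the field
// is null, so it may only be dereferenced on the driver.
class TableReader(@transient val sc: SparkContext) extends Serializable {
  def read(keyspace: String, table: String) =
    sc.cassandraTable(keyspace, table)   // safe on the driver only
}
```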
How do we reset the aggregated statistics to null?
Regards,
Sandeep Giri,
+1 347 781 4573 (US)
+91-953-899-8962 (IN)
www.KnowBigData.com. <http://KnowBigData.com.>
Phone: +1-253-397-1945 (Office)
a StreamRDD with the aggregated count and keep doing a
full outer join, but it didn't work. It seems the StreamRDD gets reset.
Kindly help.
Regards,
Sandeep Giri
Yes, update state by key worked.
Though there are some more complications.
On Oct 30, 2015 8:27 AM, "skaarthik oss" <skaarthik@gmail.com> wrote:
> Did you consider UpdateStateByKey operation?
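For reference, a minimal updateStateByKey sketch (assuming a DStream[(String, Int)] named counts and an existing StreamingContext ssc; the names and checkpoint path are illustrative):

```scala
// Keep a running total per key across batches. Stateful operations
// require checkpointing to be enabled.
ssc.checkpoint("hdfs:///tmp/checkpoints")

val updateFunc = (newValues: Seq[Int], running: Option[Int]) =>
  Some(newValues.sum + running.getOrElse(0))

val totals = counts.updateStateByKey[Int](updateFunc)
```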
> *From:* Sandeep Giri [mailto:sand...@knowbigdata.com]
parkSubmit.scala:166)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:189)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:110)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Thanks & Regards,
Giri.
use map-reduce.
On Fri, Sep 11, 2015, 14:32 Mishra, Abhishek
wrote:
> Hello ,
>
>
>
> Is there any way to query multiple collections from MongoDB using Spark
> and Java? And I want to create only one Configuration object. Please help
> if anyone has something
I think it should be possible by loading the collections as RDDs and then
doing a union on them.
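A rough sketch of that idea, assuming the mongo-hadoop connector (class names per that connector; URIs and collection names are illustrative):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkContext
import org.bson.BSONObject
import com.mongodb.hadoop.MongoInputFormat

// Each collection needs its own input URI, so build one small Configuration
// per collection and union the resulting RDDs.
def loadCollection(sc: SparkContext, uri: String) = {
  val conf = new Configuration()
  conf.set("mongo.input.uri", uri)   // e.g. mongodb://host:27017/db.users
  sc.newAPIHadoopRDD(conf, classOf[MongoInputFormat],
    classOf[Object], classOf[BSONObject])
}

val all = loadCollection(sc, "mongodb://host:27017/db.coll1")
  .union(loadCollection(sc, "mongodb://host:27017/db.coll2"))
```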
Regards,
Sandeep Giri,
Any idea what is causing this error?
15/08/28 21:03:03 WARN scheduler.TaskSetManager: Lost task 34.0 in stage
9.0 (TID 20, dtord01hdw0228p.dc.dotomi.net): java.lang.RuntimeException:
cannot find field message_campaign_id from
[0:error_error_error_error_error_error_error, 1:cannot_determine_schema,
Can we run Hive queries using spark-avro?
In our case it's not just reading the Avro file; we have a view in Hive which
is based on multiple tables.
On Thu, Aug 27, 2015 at 9:41 AM, Giri P gpatc...@gmail.com wrote:
We are using Hive 1.1.
I was able to fix the below error when I used the right version
queries in our application.
Any idea if this issue might be because of querying across different schema
versions of the data?
Thanks
Giri
On Thu, Aug 27, 2015 at 5:39 AM, java8964 java8...@hotmail.com wrote:
What version of Hive are you using? And did you compile against the right
version of Hive when you
Thank you all. I have updated it to a slightly better version.
Regards,
Sandeep Giri,
This statement is from Spark's website itself.
Regards,
Sandeep Giri,
I have prepared some interview questions:
http://www.knowbigdata.com/blog/interview-questions-apache-spark-part-1
http://www.knowbigdata.com/blog/interview-questions-apache-spark-part-2
Please provide your feedback.
On Wed, Jul 29, 2015, 23:43 Pedro Rodriguez ski.rodrig...@gmail.com wrote:
You
Even for 2 lakh (200,000) records, MySQL will be better.
Regards,
Sandeep Giri,
But on Spark 0.9 we don't have these options:
--num-executors: controls how many executors will be allocated
--executor-memory: RAM for each executor
--executor-cores: CPU cores for each executor
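For comparison, a sketch of both styles (the 0.9 flag names are from the old YARN client docs as best I recall, and the jar/class names are illustrative; verify against your build):

```shell
# Spark 1.x style (not available on 0.9):
spark-submit --num-executors 10 --executor-memory 4g --executor-cores 2 my-app.jar

# Spark 0.9 on YARN used "worker"-flavored flags on the YARN client instead:
SPARK_JAR=assembly.jar ./bin/spark-class org.apache.spark.deploy.yarn.Client \
  --jar my-app.jar --class MyApp \
  --num-workers 10 --worker-memory 4g --worker-cores 2
```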
On Fri, Dec 12, 2014 at 12:27 PM, Sameer Farooqui same...@databricks.com
wrote:
Hi,
FYI - There