Hello Team,
I am trying to write a Dataset as a Parquet file in append mode, partitioned
by a few columns. However, since the job is time-consuming, I would like to
enable DirectFileOutputCommitter (i.e., bypass the writes to the temporary
folder).
The Spark version I am using is 2.3.1.
Can someone
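For context, the direct committer classes were removed from Spark in the 2.x line, so on 2.3.1 the closest built-in option is version 2 of the Hadoop FileOutputCommitter algorithm, which moves task output into place as tasks finish instead of doing one serial rename out of `_temporary` at job commit. A minimal sketch that assumes a Spark runtime; the dataset, partition columns, and paths are hypothetical:

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

val spark = SparkSession.builder()
  .appName("parquet-append")
  // v2 commit algorithm: tasks promote their output directly,
  // avoiding the final serial rename from the temporary directory
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  .getOrCreate()

val ds = spark.read.parquet("/data/input")   // hypothetical source
ds.write
  .mode(SaveMode.Append)
  .partitionBy("year", "month")              // hypothetical partition columns
  .parquet("/data/output")
```

Note that the v2 algorithm trades atomicity for speed: a failed job can leave partial output behind, which matters more on object stores than on HDFS.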
Hi All,
I have video surveillance data that needs to be processed in Spark. I am
looking into Spark + OpenCV. How do I load .mp4 frames into an RDD? Can we
do this directly, or does the video need to be converted to a SequenceFile?
Thanks,
Padma CH
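Spark has no built-in mp4 reader, so one pattern is to parallelize the list of video paths and decode frames per partition with OpenCV's Java bindings. This is only a sketch under strong assumptions: OpenCV's native library is installed on every executor, the videos are reachable from every node, and the frame volume fits the chosen encoding. All names here are illustrative:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD
import org.opencv.core.{Core, Mat, MatOfByte}
import org.opencv.imgcodecs.Imgcodecs
import org.opencv.videoio.VideoCapture

// Decode every frame of every video into a PNG byte array.
// Mat is not serializable, so frames are shipped as Array[Byte].
def framesAsPng(sc: SparkContext, videos: Seq[String]): RDD[Array[Byte]] =
  sc.parallelize(videos).flatMap { path =>
    System.loadLibrary(Core.NATIVE_LIBRARY_NAME) // load native lib in each executor JVM
    val cap   = new VideoCapture(path)
    val frame = new Mat()                        // reused buffer for each decoded frame
    val buf   = new MatOfByte()
    Iterator.continually(cap.read(frame))
      .takeWhile(identity)                       // stop when the stream is exhausted
      .map { _ =>
        Imgcodecs.imencode(".png", frame, buf)
        buf.toArray
      }.toList
  }
```

Converting the videos to SequenceFiles up front is the other route and avoids needing native codecs on the executors, at the cost of a preprocessing step.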
> Priya,
>
> You wouldn't necessarily "use spark" to send the alert. Spark is in an
> important sense one library among many. You can have your application use
> any other library available for your language to send the alert.
>
> Marcin
>
> On Tue, Jul 12,
Has anyone resolved this?
Thanks,
Padma CH
On Wed, Jun 22, 2016 at 4:39 PM, Priya Ch <learnings.chitt...@gmail.com>
wrote:
> Hi All,
>
> I am running a Spark application with 1.8 TB of data (which is stored in
> Hive tables). I am reading the data using HiveCont
Hi All,
I am running a Spark application with 1.8 TB of data (which is stored in
Hive tables). I am reading the data using HiveContext and processing it.
The cluster has 5 nodes in total, with 25 cores and 250 GB per node. I
am launching the application with 25 executors, 5 cores each
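For reference, that layout saturates the cluster's 125 cores (5 nodes x 25 cores) exactly. It can be expressed on the command line as below; the memory figure is a hypothetical split of the 250 GB per node across 5 executors, leaving headroom for YARN overhead and the OS, and the class and jar names are placeholders:

```shell
spark-submit \
  --num-executors 25 \
  --executor-cores 5 \
  --executor-memory 40g \
  --driver-memory 8g \
  --class com.example.MyApp \
  myapp.jar
```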
Hello Team,
I am trying to join 2 RDDs, where one is 800 MB and the other is 190 MB.
During the join step, my job halts and I don't see any progress in the
execution.
This is the message I see on the console:
INFO spark.MapOutputTrackerMasterEndpoint: Asked to send map output
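The MapOutputTracker message above is part of shuffle coordination, and a regular RDD join shuffles both sides. When one side is small enough to hold in memory, a map-side join via a broadcast variable skips the shuffle entirely. A sketch with hypothetical key and value types; note 190 MB is borderline for broadcasting and the driver must be able to collect it:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Inner join of a large RDD against a small one without a shuffle:
// collect the small side to the driver, broadcast it, and probe it per record.
def mapSideJoin(sc: SparkContext,
                big: RDD[(String, Long)],
                small: RDD[(String, String)]): RDD[(String, (Long, String))] = {
  val lookup = sc.broadcast(small.collectAsMap())
  big.flatMap { case (k, v) =>
    lookup.value.get(k).map(w => (k, (v, w)))  // drop keys absent from the small side
  }
}
```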
Hi All,
I have two RDDs, A and B, where A is 30 MB and B is 7 MB;
A.cartesian(B) is taking too much time. Is there any bottleneck in the
cartesian operation?
I am using Spark version 1.6.0.
Regards,
Padma Ch
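The cost here is inherent rather than a bottleneck: cartesian materializes every pair, so the output has |A| x |B| records and roughly the product of the input sizes in bytes. A plain Scala illustration of the blow-up, with arbitrary sizes standing in for the RDD elements:

```scala
object CartesianSize {
  def main(args: Array[String]): Unit = {
    val a = (1 to 1000).toList  // stand-ins for the elements of RDD A
    val b = (1 to 500).toList   // stand-ins for the elements of RDD B
    // every element of a is paired with every element of b
    val pairs = for (x <- a; y <- b) yield (x, y)
    println(pairs.size)         // 1000 * 500 = 500000 pairs
  }
}
```

If only matching pairs are actually needed, keying both RDDs and using join instead of cartesian cuts the output back to linear size.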
d use "lsof" on
> one of the spark executors (perhaps run it in a for loop, writing the
> output to separate files) until it fails and see which files are being
> opened, if there's anything that seems to be taking up a clear majority
> that might key you in on the culprit.
>
On Wednesday, January 6, 2016 4:00 AM, Priya Ch <
> learnings.chitt...@gmail.com> wrote:
>
>
> Running 'lsof' will show us the open files, but how do we find the
> root cause behind opening too many files?
>
> Thanks,
> Padma CH
>
> On Wed, J
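The lsof-in-a-loop suggestion above can be scripted; the pid below is hypothetical and would be an executor process id found via jps or ps:

```shell
ulimit -n                            # current per-process open-file limit
pid=12345                            # hypothetical executor pid
i=0
while kill -0 "$pid" 2>/dev/null; do # loop until the process exits
  lsof -p "$pid" > "lsof_${i}.txt"   # snapshot its open files
  i=$((i + 1))
  sleep 5
done
# diff the last snapshots to see which paths dominate before the failure
```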
the "too many open files"
> exception.
>
>
> On Tuesday, January 5, 2016 8:03 AM, Priya Ch <
> learnings.chitt...@gmail.com> wrote:
>
>
> Can someone throw light on this?
>
> Regards,
> Padma Ch
>
> On Mon, Dec 28, 2015 at 3:59 PM, Priya Ch &l
, 2015 at 3:06 PM, Petr Novak <oss.mli...@gmail.com> wrote:
> add @transient?
>
> On Mon, Sep 21, 2015 at 11:27 AM, Priya Ch <learnings.chitt...@gmail.com>
> wrote:
>
>> Hello All,
>>
>> How can I pass the SparkContext as a parameter to a method in an obje
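Petr's @transient suggestion applies when the object holding the SparkContext gets pulled into a task closure: marking the field @transient keeps it out of serialization, though the context can then only be used on the driver. A minimal sketch with hypothetical class and method names:

```scala
import org.apache.spark.SparkContext

// @transient excludes sc from closure serialization; after deserialization
// on an executor the field is null, so it must only be touched driver-side.
class Pipeline(@transient private val sc: SparkContext) extends Serializable {
  def lineCount(path: String): Long =
    sc.textFile(path).count()   // runs on the driver, which still holds sc
}
```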
; true. What is the possible solution for this?
Is this a bug in Spark 1.3.0? Would changing the scheduling mode to
standalone or Mesos mode work fine?
Please, someone, share your views on this.
On Sat, Sep 12, 2015 at 11:04 PM, Priya Ch <learnings.chitt...@gmail.com>
wrote:
> Hello A
Hello All,
When I push messages into Kafka and read them in a streaming application, I
see the following exception.
I am running the application on YARN and am nowhere broadcasting the message
within the application. I am simply reading a message, parsing it,
populating fields in a class, and then
key.
Hope that helps.
Greetings,
Juan
2015-07-30 10:50 GMT+02:00 Priya Ch learnings.chitt...@gmail.com:
Hi All,
Can someone share insights on this?
On Wed, Jul 29, 2015 at 8:29 AM, Priya Ch learnings.chitt...@gmail.com
wrote:
Hi TD,
Thanks for the info. My scenario is like this:
I am reading data from a Kafka topic. Let's say Kafka has 3 partitions
for the topic. In my
. This will guard against multiple attempts to
run the task that inserts into Cassandra.
See
http://spark.apache.org/docs/latest/streaming-programming-guide.html#semantics-of-output-operations
TD
On Sun, Jul 26, 2015 at 11:19 AM, Priya Ch learnings.chitt...@gmail.com
wrote:
Hi All,
I have
Hi All,
I have a problem when writing streaming data to Cassandra. Our existing
product is on an Oracle DB, in which locks are maintained while writing
data so that duplicates in the DB are avoided.
But as Spark has a parallel processing architecture, if more than 1 thread is
trying to write the same
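One property that helps here without any locking: Cassandra writes are upserts keyed by the primary key, so two tasks writing the same logical record converge to one row rather than a duplicate, provided the primary key captures record identity. A sketch using the spark-cassandra-connector; the keyspace, table, and case class are hypothetical:

```scala
import com.datastax.spark.connector._
import org.apache.spark.rdd.RDD

case class Event(eventId: String, ts: Long, payload: String)

// With event_id as the primary key, a re-executed task overwrites the
// same rows instead of inserting duplicates, so retries stay idempotent.
def write(events: RDD[Event]): Unit =
  events.saveToCassandra("surveillance", "events",
    SomeColumns("event_id", "ts", "payload"))
```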
Hi All,
I have Akka remote actors running on 2 nodes. I submitted the Spark
application from node1. In the Spark code, in one of the RDDs, I am sending
a message to the actor running on node1. My Spark code is as follows:
class ActorClient extends Actor with Serializable
{
import context._
val
Hi All,
We have set up a 2-node cluster (NODE-DSRV05 and NODE-DSRV02), each node
having 32 GB RAM, 1 TB of hard disk capacity, and 8 CPU cores. We have set
up HDFS, which has 2 TB capacity and a block size of 256 MB. When we try
to process a 1 GB file on Spark, we see the following exception
Hi Team,
When I try to use the DenseMatrix from the breeze library in Spark, it
throws the following error:
java.lang.NoClassDefFoundError: breeze/storage/Zero
Can someone help me on this ?
Thanks,
Padma Ch
of breeze to the classpath? In Spark
1.0, we use breeze 0.7, and in Spark 1.1 we use 0.9. If the breeze
version you used is different from the one that comes with Spark, you might
see a class-not-found error. -Xiangrui
On Fri, Oct 3, 2014 at 4:22 AM, Priya Ch learnings.chitt...@gmail.com
wrote:
Hi Team
Please accept the request
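If the mismatch Xiangrui describes is the cause, pinning the application's breeze dependency to the version bundled with the Spark release should resolve it. A build configuration fragment, assuming sbt and a Spark 1.1 build:

```scala
// build.sbt: match breeze to the version Spark 1.1 ships (0.9),
// so the classes on the executor classpath agree with the driver's
libraryDependencies += "org.scalanlp" %% "breeze" % "0.9"
```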