> On Jul 14, 2016, at 12:28, Balachandar R.A. <balachandar...@gmail.com>
> wrote:
>
> Hello Ted,
>
> Thanks for the response. Here is the additional information.
>
> I am using Spark 1.6.1 (spark-1.6.1-bin-hadoop2.6).
>
> Here is the code snippet:
>
> JavaRDD add = jsc.parallelize(listFolders, listFolders.size());
>
> JavaRDD test = add.map(new
Hello,
In one of my use cases, I need to process a list of folders in parallel. I
used
sc.parallelize(list, list.size).map(<logic to process the folder>).
I have a six-node cluster and there are six folders to process. Ideally, I
expect each of my nodes to process one folder. But I see that a
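A minimal sketch of this pattern, assuming an existing SparkContext sc and a hypothetical processFolder function holding the per-folder logic (folder paths are also made up), might look like:

val folders = Seq("/data/f1", "/data/f2", "/data/f3", "/data/f4", "/data/f5", "/data/f6")
// numSlices = folders.size forces one folder per partition, so each task handles exactly one folder;
// whether the six tasks actually land on six different nodes also depends on scheduling and executor placement.
val results = sc.parallelize(folders, folders.size)
  .map(folder => (folder, processFolder(folder)))
  .collect()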
Hello,
I have one simple Apache Spark based use case that processes two datasets.
Each dataset takes about 5-7 minutes to process. I am doing this processing
inside the sc.parallelize(datasets){ } block. While the first dataset is
processed successfully, the processing of the second dataset is not started by
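One common way to get two datasets processed concurrently is to submit the two jobs from separate threads with Scala Futures, since the Spark scheduler is thread-safe for job submission. This is only a sketch under assumptions: sc is an existing SparkContext, the paths are hypothetical, and a count() stands in for the real per-dataset logic, which is not shown in the thread.

import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration.Duration

// Hypothetical per-dataset work; the real logic is not shown in the thread.
def processDataset(path: String): Long = sc.textFile(path).count()

val paths = Seq("hdfs:///data/dataset1", "hdfs:///data/dataset2")   // hypothetical paths
val jobs = paths.map(p => Future { processDataset(p) })             // both jobs run concurrently
val results = Await.result(Future.sequence(jobs), Duration.Inf)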
titions { iter =>
> val folder = iter.next
> val status: Int =
> Seq(status).toIterator
> }
>
> On Jun 30, 2016, at 16:42, Balachandar R.A. <balachandar...@gmail.com>
> wrote:
>
> Hello,
>
> I have some 100 folders. Each folder contains 5 files. I have
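A hedged completion of the fragment above, which appears to be a mapPartitions call: assuming foldersRdd was built with one folder path per partition (e.g. sc.parallelize(folders, folders.size)) and that the black-box executable lives at a hypothetical path, the exit code can be returned as the status via scala.sys.process.

import scala.sys.process._

val statuses = foldersRdd.mapPartitions { iter =>
  val folder = iter.next()
  // Run the black-box executable on this folder; ! returns the process exit code.
  val status: Int = Seq("/opt/tools/process_folder", folder).!
  Seq(status).toIterator
}
statuses.collect().foreach(println)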
Hello,
I have some 100 folders. Each folder contains 5 files. I have an executable
that processes one folder. The executable is a black box and hence cannot
be modified. I would like to process the 100 folders in parallel using Apache
Spark so that I can spawn a map task per folder. Can
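An alternative sketch for driving a black-box executable from Spark is RDD.pipe, which feeds each element to the command's stdin as a line and collects its stdout lines. This assumes the executable can accept a folder path on stdin, which the thread does not state; the executable path and folder names below are hypothetical.

// One folder per partition, so one invocation of the executable per task.
val folders = sc.parallelize(Seq("/data/folder001", "/data/folder002"), 2)
val output = folders.pipe("/opt/tools/process_folder")
output.collect().foreach(println)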
I am new to GraphX and exploring the example flight data analysis found
online:
http://www.sparktutorials.net/analyzing-flight-data:-a-gentle-introduction-to-graphx-in-spark
I tried calculating inDegrees (to understand how many incoming flights an
airport has), but I see a value which corresponds to
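For reference, a minimal self-contained inDegrees example on toy data (not the tutorial's dataset; assumes an existing SparkContext sc). One detail that often surprises newcomers: airports with zero incoming flights do not appear in the inDegrees VertexRDD at all.

import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD

// Toy flight graph: vertices are airports, edges are flights between them.
val airports: RDD[(Long, String)] = sc.parallelize(Seq((1L, "SFO"), (2L, "JFK"), (3L, "ORD")))
val flights: RDD[Edge[Int]] = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(3L, 2L, 1), Edge(2L, 1L, 1)))
val graph = Graph(airports, flights)

// inDegrees counts incoming edges per vertex; join back to the vertex RDD to get airport codes.
graph.inDegrees.join(airports).collect().foreach {
  case (_, (inDeg, code)) => println(s"$code has $inDeg incoming flights")
}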
Thanks... Will look into that
- Bala
On 28 January 2016 at 15:36, Sahil Sareen <sareen...@gmail.com> wrote:
> Try Neo4j for visualization; GraphX does a pretty good job at distributed
> graph processing.
>
> On Thu, Jan 28, 2016 at 12:42 PM, Balachandar R.A. <
> balacha
Hi,
I am new to Spark MLlib and machine learning. I have a CSV file that
consists of around 100 thousand rows and 20 columns. Of these 20 columns,
10 contain string values. The values in these columns are not necessarily
unique. They are kind of categorical, that is, the values could be one
> Can't you build a simple dictionary and map those values to numbers?
>
> Cheers
> Guillaume
>
> On 5 November 2015 at 09:54, Balachandar R.A. <balachandar...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I am new to Spark MLlib and machine learning. I h
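A minimal sketch of the dictionary idea Guillaume describes, under assumptions: the CSV has already been parsed into an RDD[Array[String]] named rows, sc is an existing SparkContext, and column index 5 is an arbitrary stand-in for one of the ten string columns.

val colIdx = 5   // hypothetical index of one categorical string column

// Build a dictionary mapping each distinct string value to a numeric code.
val dict: Map[String, Double] = rows.map(_(colIdx)).distinct().collect()
  .zipWithIndex.map { case (v, i) => v -> i.toDouble }.toMap
val bDict = sc.broadcast(dict)

// Replace the string value with its numeric code in every row.
val encoded = rows.map { r => r.updated(colIdx, bDict.value(r(colIdx)).toString) }

Spark ML also provides a StringIndexer transformer that performs the same mapping for DataFrame columns, which may be more convenient once the data is loaded as a DataFrame.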
-- Forwarded message --
From: "Balachandar R.A." <balachandar...@gmail.com>
Date: 02-Nov-2015 12:53 pm
Subject: Re: Error : - No filesystem for scheme: spark
To: "Jean-Baptiste Onofré" <j...@nanthrax.net>
Cc:
> Hi JB,
> Thanks for the respo
>> On 2 November 2015 at 14:59, Romi Kuntsman <r...@totango.com> wrote:
>>
>> except "spark.master", do you have "spark://" anywhere in your code
>> or config files?
>>
>>
> Romi Kuntsman, Big Data Engineer
> http://www.totango.com
>
> On Mon, Nov 2, 2015 at 11:27 AM, Balachandar R.A. <
> balachandar...@gmail.com> wrote:
>
>>
>> -- Forwarded message --
>> From: "Balachandar R.A." <balachandar...@g
I made a stupid mistake, it seems. I supplied the --master option with the
Spark URL in my launch command, and this error is gone.
Thanks for pointing out possible places to troubleshoot.
Regards
Bala
On 02-Nov-2015 3:15 pm, "Balachandar R.A." <balachandar...@gmail.com> wro
Can someone tell me at what point this error could come?
In one of my use cases, I am trying to use a Hadoop custom input format. Here
is my code:
val hConf: Configuration = sc.hadoopConfiguration
hConf.set("fs.hdfs.impl", classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName)
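For context, a sketch of how such a configuration is typically wired into an input-format read. TextInputFormat stands in for the custom input format, whose class is not shown in the thread, and the path is hypothetical; an existing SparkContext sc is assumed.

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

val hConf: Configuration = sc.hadoopConfiguration
hConf.set("fs.hdfs.impl", classOf[org.apache.hadoop.hdfs.DistributedFileSystem].getName)

val records = sc.newAPIHadoopFile(
  "hdfs:///data/input",          // hypothetical path
  classOf[TextInputFormat],      // stand-in for the custom input format
  classOf[LongWritable],
  classOf[Text],
  hConf)
println(records.count())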
Hello,
I have developed a Hadoop-based solution that processes a binary file. It
uses the classic Hadoop MapReduce technique. The binary file is about 10 GB,
divided into 73 HDFS blocks, and the business logic, written as a map
process, operates on each of these 73 blocks. We have developed a
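The message is cut off here, but if the goal is to reproduce this per-block map processing in Spark, one hedged sketch uses sc.binaryRecords. This assumes the binary file consists of fixed-length records, which the thread does not say, and assumes an existing SparkContext sc; the record size and path are hypothetical.

val recordLength = 1024   // hypothetical fixed record size in bytes

// Each element is one fixed-length record as a byte array.
val records = sc.binaryRecords("hdfs:///data/bigfile.bin", recordLength)   // hypothetical path
val processed = records.map { bytes =>
  // placeholder for the real business logic, which is not shown in the thread
  bytes.length
}
println(processed.count())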