Not able to receive data in spark from rsyslog

2015-12-03 Thread masoom alam
I am getting an error: I am not able to receive data from rsyslog in my Spark
Streaming application. Please help with any pointers.
9 - java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at
java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
at
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
at
java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:579)
at java.net.Socket.connect(Socket.java:528)
at java.net.Socket.<init>(Socket.java:425)
at java.net.Socket.<init>(Socket.java:208)
at
org.apache.spark.streaming.dstream.SocketReceiver.receive(SocketInputDStream.scala:73)
at
org.apache.spark.streaming.dstream.SocketReceiver$$anon$2.run(SocketInputDStream.scala:59)

15/12/04 02:21:29 INFO ReceiverSupervisorImpl: Stopped receiver 0

However, nc -lk 999 gives the data, which is received perfectly. Any
clue...
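
For reference, a minimal sketch of the receiving side (the host name and port
below are placeholders, reusing the port from the nc test above):

import org.apache.spark.SparkConf;
import org.apache.spark.streaming.Durations;
import org.apache.spark.streaming.api.java.JavaDStream;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

public class RsyslogReceiver {
    public static void main(String[] args) throws Exception {
        SparkConf conf = new SparkConf().setAppName("RsyslogReceiver");
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(10));

        // The socket receiver actively connects out to this host:port.
        // "Connection refused" means nothing was listening there from the
        // worker's point of view (wrong host/port, or a listener bound to a
        // different interface than the one the worker connects to).
        JavaDStream<String> lines = jssc.socketTextStream("localhost", 999);
        lines.print();

        jssc.start();
        jssc.awaitTermination();
    }
}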

Thanks


Using spark in cluster mode

2015-10-20 Thread masoom alam
Dear all

I want to set up Spark in cluster mode. The problem is that each worker node
is looking for the file to process in its local directory. Is it possible to
set up something like HDFS so that each worker node takes its part of the file
from HDFS? Any good tutorials for this?
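
As a rough sketch of what that looks like on the Spark side (the namenode
address and path below are placeholders), the only change is to read from an
hdfs:// URI instead of a local path, so each worker processes the HDFS blocks
behind its own partitions:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class HdfsInputExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("HdfsInputExample");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // An HDFS URI instead of a local file path; the file must first be
        // copied into HDFS, e.g. with: hdfs dfs -put input.txt /data/input.txt
        JavaRDD<String> lines = sc.textFile("hdfs://namenode:9000/data/input.txt");
        System.out.println("lines: " + lines.count());

        sc.stop();
    }
}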

Thanks


FP-growth on stream data

2015-09-27 Thread masoom alam
Is it possible to run FP-growth on stream data in its current version, or is
there a way around it?

I mean is it possible to use/augment the old tree with the new incoming
data and find the new set of frequent patterns?

Thanks


Scala api end points

2015-09-24 Thread masoom alam
Hi everyone

I am new to Scala. I have written an application in Spark using Scala.
Now we want to expose it through REST API end points. What is the
best choice for this? Please share your experiences.

Thanks


Spark Taking too long on K-means clustering

2015-08-27 Thread masoom alam
Hi everyone,

I am trying to run the KDD data set - basically chapter 5 of the Advanced
Analytics with Spark book. The data set is 789 MB, but Spark is taking
some 3 to 4 hours. Is this normal behaviour, or is some tuning required?
The server RAM is 32 GB, but we can only give 4 GB of RAM to Java on
64-bit Ubuntu.
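
For reference, a minimal sketch of where the memory settings would go (the
values are only illustrative, not a recommendation for this data set):

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class KMeansJobSetup {
    public static void main(String[] args) {
        // spark.executor.memory controls the executor heap. Driver memory is
        // better set on the command line (spark-submit --driver-memory 4g),
        // since the driver JVM is already running by the time this code runs.
        SparkConf conf = new SparkConf()
                .setAppName("KDD-KMeans")
                .set("spark.executor.memory", "4g");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // ... parse the KDD records, cache() the feature vectors, run KMeans ...

        sc.stop();
    }
}

Caching the parsed feature vectors before clustering also matters, since
K-means makes many passes over the same data.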

Please guide.

Thanks


Re: Not able to run FP-growth Example

2015-06-15 Thread masoom alam
<artifactId>istack-commons-runtime</artifactId>
<groupId>com.sun.istack</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.codehaus.groovy</groupId>
<artifactId>groovy-all</artifactId>
<version>2.3.7</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.scalatest</groupId>
<artifactId>scalatest_2.10</artifactId>
<version>2.2.1</version>
<scope>test</scope>
</dependency>
</dependencies>
<properties>
<sbt.project.name>mllib</sbt.project.name>
</properties>
</project>

Any clues?

On Sun, Jun 14, 2015 at 8:20 PM, masoom alam masoom.a...@wanclouds.net
wrote:

 *Getting the following error:*

 [INFO]
 [INFO] ------------------------------------------------------------------------
 [INFO] Building example 0.0.1
 [INFO] ------------------------------------------------------------------------
 Downloading:
 http://repo.maven.apache.org/maven2/org/apache/spark/spark-mllib_2.10/1.4.0/spark-mllib_2.10-1.4.0.pom
 [INFO] ------------------------------------------------------------------------
 [INFO] BUILD FAILURE
 [INFO] ------------------------------------------------------------------------
 [INFO] Total time: 41.561s
 [INFO] Finished at: Mon Jun 15 08:17:43 PKT 2015
 [INFO] Final Memory: 6M/16M
 [INFO] ------------------------------------------------------------------------
 [ERROR] Failed to execute goal on project learning-spark-mini-example:
 Could not resolve dependencies for project
 com.oreilly.learningsparkexamples.mini:learning-spark-mini-example:jar:0.0.1:
 Failed to collect dependencies for
 [org.apache.spark:spark-core_2.10:jar:1.3.0 (provided),
 org.apache.spark:spark-mllib_2.10:jar:1.4.0 (compile)]: Failed to read
 artifact descriptor for org.apache.spark:spark-mllib_2.10:jar:1.4.0: Could
 not transfer artifact org.apache.spark:spark-mllib_2.10:pom:1.4.0 from/to
 central (http://repo.maven.apache.org/maven2): repo.maven.apache.org:
 Unknown host repo.maven.apache.org - [Help 1]
 [ERROR]
 [ERROR] To see the full stack trace of the errors, re-run Maven with the
 -e switch.
 [ERROR] Re-run Maven using the -X switch to enable full debug logging.
 [ERROR]
 [ERROR] For more information about the errors and possible solutions,
 please read the following articles:
 [ERROR] [Help 1]
 http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException




 _

 *My POM file is as follows:-*

 <?xml version="1.0" encoding="UTF-8"?>
 <!--project xmlns="http://maven.apache.org/POM/4.0.0"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
 http://maven.apache.org/xsd/maven-4.0.0.xsd">
 <modelVersion>4.0.0</modelVersion>

 <groupId>com.oreilly.learningsparkexamples.mini</groupId>
 <artifactId>learning-spark-mini-example</artifactId>
 <version>1.0-SNAPSHOT</version>

 </project>
  -->

 <project>
 <groupId>com.oreilly.learningsparkexamples.mini</groupId>
 <artifactId>learning-spark-mini-example</artifactId>
 <modelVersion>4.0.0</modelVersion>
 <name>example</name>
 <packaging>jar</packaging>
 <version>0.0.1</version>
 <dependencies>
 <dependency> <!-- Spark dependency -->
 <groupId>org.apache.spark</groupId>
 <artifactId>spark-core_2.10</artifactId>
 <version>1.3.0</version>
 <scope>provided</scope>
 </dependency>
 <dependency> <!-- Spark dependency -->
 <groupId>org.apache.spark</groupId>
 <artifactId>spark-mllib_2.10</artifactId>
 <version>1.4.0</version>
 <scope>provided</scope>
 </dependency>
 </dependencies>
 <properties>
 <java.version>1.7</java.version>
 </properties>
 <build>
 <pluginManagement>
 <plugins>
 <plugin> <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-compiler-plugin</artifactId>
 <version>3.1</version>
 <configuration>
 <source>${java.version}</source>
 <target>${java.version}</target>
 </configuration>
 </plugin>
 </plugins>
 </pluginManagement>
 </build>
 </project>

 ___

 *I have noticed that it tries to download the following file*:
 http://repo.maven.apache.org/maven2/org/apache/spark/spark-mllib_2.10/1.4.0/spark-mllib_2.10-1.4.0.pom
  *which
 is available*

 Any pointers?

 Thanks for the help.




 On Sun, Jun 14, 2015 at 5:24 AM, masoom alam masoom.a...@wanclouds.net
 wrote:

 Thanks a lot. Will try in a while n update

 Thanks again
 On Jun 14, 2015 5:13 PM, Sonal Goyal sonalgoy...@gmail.com wrote:

 Try with spark-mllib_2.10 as the artifactid
 On Jun 14, 2015 12:02 AM

Re: Not able to run FP-growth Example

2015-06-14 Thread masoom alam
These two imports are missing and thus FP-growth is not compiling...

import org.apache.spark.mllib.fpm.FPGrowth;
import org.apache.spark.mllib.fpm.FPGrowthModel;

How to include the dependency in the POM file?
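
For reference, the dependency block that the rest of this thread converges on
uses the Scala-suffixed artifact id (the version should match the spark-core
version already in the POM, 1.3.0 here):

<dependency> <!-- Spark MLlib dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>1.3.0</version>
<scope>provided</scope>
</dependency>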

On Sat, Jun 13, 2015 at 4:26 AM, masoom alam masoom.a...@wanclouds.net
wrote:

 Thanks for the answer. Any example?
 On Jun 13, 2015 2:13 PM, Sonal Goyal sonalgoy...@gmail.com wrote:

 I think you need to add dependency to spark mllib too.
 On Jun 13, 2015 11:10 AM, masoom alam masoom.a...@wanclouds.net
 wrote:

 Hi everyone,

 I am trying to run the FP growth example. I have tried to compile the
 following POM file:

 <project>
 <groupId>com.oreilly.learningsparkexamples.mini</groupId>
 <artifactId>learning-spark-mini-example</artifactId>
 <modelVersion>4.0.0</modelVersion>
 <name>example</name>
 <packaging>jar</packaging>
 <version>0.0.1</version>
 <dependencies>
 <dependency> <!-- Spark dependency -->
 <groupId>org.apache.spark</groupId>
 <artifactId>spark-core_2.10</artifactId>
 <version>1.3.0</version>
 <scope>provided</scope>
 </dependency>
 </dependencies>
 <properties>
 <java.version>1.7</java.version>
 </properties>
 <build>
 <pluginManagement>
 <plugins>
 <plugin> <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-compiler-plugin</artifactId>
 <version>3.1</version>
 <configuration>
 <source>${java.version}</source>
 <target>${java.version}</target>
 </configuration>
 </plugin>
 </plugins>
 </pluginManagement>
 </build>
 </project>

 It successfully builds the project, but IDE is complaining
 that: Error:(29, 34) java: package org.apache.spark.mllib.fpm does not exist

 Just as a side note, I downloaded Version 1.3 of Spark so FP-growth
 algorithm should be part of it?

 Thanks.




Re: Not able to run FP-growth Example

2015-06-14 Thread masoom alam
This is not working:

 <dependency> <!-- Spark dependency -->
<groupId>org.apache.spark.mlib</groupId>
<artifactId>spark-mlib</artifactId>
<!-- <version>1.3.0</version> -->
<scope>provided</scope>
</dependency>



On Sat, Jun 13, 2015 at 11:56 PM, masoom alam masoom.a...@wanclouds.net
wrote:

 These two imports are missing and thus FP-growth is not compiling...

 import org.apache.spark.mllib.fpm.FPGrowth;
 import org.apache.spark.mllib.fpm.FPGrowthModel;

 How to include the dependency in the POM file?

 On Sat, Jun 13, 2015 at 4:26 AM, masoom alam masoom.a...@wanclouds.net
 wrote:

 Thanks for the answer. Any example?
 On Jun 13, 2015 2:13 PM, Sonal Goyal sonalgoy...@gmail.com wrote:

 I think you need to add dependency to spark mllib too.
 On Jun 13, 2015 11:10 AM, masoom alam masoom.a...@wanclouds.net
 wrote:

 Hi everyone,

 I am trying to run the FP growth example. I have tried to compile the
 following POM file:

 <project>
 <groupId>com.oreilly.learningsparkexamples.mini</groupId>
 <artifactId>learning-spark-mini-example</artifactId>
 <modelVersion>4.0.0</modelVersion>
 <name>example</name>
 <packaging>jar</packaging>
 <version>0.0.1</version>
 <dependencies>
 <dependency> <!-- Spark dependency -->
 <groupId>org.apache.spark</groupId>
 <artifactId>spark-core_2.10</artifactId>
 <version>1.3.0</version>
 <scope>provided</scope>
 </dependency>
 </dependencies>
 <properties>
 <java.version>1.7</java.version>
 </properties>
 <build>
 <pluginManagement>
 <plugins>
 <plugin> <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-compiler-plugin</artifactId>
 <version>3.1</version>
 <configuration>
 <source>${java.version}</source>
 <target>${java.version}</target>
 </configuration>
 </plugin>
 </plugins>
 </pluginManagement>
 </build>
 </project>

 It successfully builds the project, but IDE is complaining
 that: Error:(29, 34) java: package org.apache.spark.mllib.fpm does not 
 exist

 Just as a side note, I downloaded Version 1.3 of Spark so FP-growth
 algorithm should be part of it?

 Thanks.





Re: Not able to run FP-growth Example

2015-06-14 Thread masoom alam
*Getting the following error:*

[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building example 0.0.1
[INFO] ------------------------------------------------------------------------
Downloading:
http://repo.maven.apache.org/maven2/org/apache/spark/spark-mllib_2.10/1.4.0/spark-mllib_2.10-1.4.0.pom
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 41.561s
[INFO] Finished at: Mon Jun 15 08:17:43 PKT 2015
[INFO] Final Memory: 6M/16M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal on project learning-spark-mini-example:
Could not resolve dependencies for project
com.oreilly.learningsparkexamples.mini:learning-spark-mini-example:jar:0.0.1:
Failed to collect dependencies for
[org.apache.spark:spark-core_2.10:jar:1.3.0 (provided),
org.apache.spark:spark-mllib_2.10:jar:1.4.0 (compile)]: Failed to read
artifact descriptor for org.apache.spark:spark-mllib_2.10:jar:1.4.0: Could
not transfer artifact org.apache.spark:spark-mllib_2.10:pom:1.4.0 from/to
central (http://repo.maven.apache.org/maven2): repo.maven.apache.org:
Unknown host repo.maven.apache.org - [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions,
please read the following articles:
[ERROR] [Help 1]
http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException



_

*My POM file is as follows:-*

<?xml version="1.0" encoding="UTF-8"?>
<!--project xmlns="http://maven.apache.org/POM/4.0.0"
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>

<groupId>com.oreilly.learningsparkexamples.mini</groupId>
<artifactId>learning-spark-mini-example</artifactId>
<version>1.0-SNAPSHOT</version>

</project>
 -->

<project>
<groupId>com.oreilly.learningsparkexamples.mini</groupId>
<artifactId>learning-spark-mini-example</artifactId>
<modelVersion>4.0.0</modelVersion>
<name>example</name>
<packaging>jar</packaging>
<version>0.0.1</version>
<dependencies>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.3.0</version>
<scope>provided</scope>
</dependency>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-mllib_2.10</artifactId>
<version>1.4.0</version>
<scope>provided</scope>
</dependency>
</dependencies>
<properties>
<java.version>1.7</java.version>
</properties>
<build>
<pluginManagement>
<plugins>
<plugin> <groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
</project>
___

*I have noticed that it tries to download the following file*:
http://repo.maven.apache.org/maven2/org/apache/spark/spark-mllib_2.10/1.4.0/spark-mllib_2.10-1.4.0.pom
*which
is available*

Any pointers?

Thanks for the help.




On Sun, Jun 14, 2015 at 5:24 AM, masoom alam masoom.a...@wanclouds.net
wrote:

 Thanks a lot. Will try in a while n update

 Thanks again
 On Jun 14, 2015 5:13 PM, Sonal Goyal sonalgoy...@gmail.com wrote:

 Try with spark-mllib_2.10 as the artifactid
 On Jun 14, 2015 12:02 AM, masoom alam masoom.a...@wanclouds.net
 wrote:

 This is not working:

  <dependency> <!-- Spark dependency -->
 <groupId>org.apache.spark.mlib</groupId>
 <artifactId>spark-mlib</artifactId>
 <!-- <version>1.3.0</version> -->
 <scope>provided</scope>
 </dependency>



 On Sat, Jun 13, 2015 at 11:56 PM, masoom alam masoom.a...@wanclouds.net
  wrote:

 These two imports are missing and thus FP-growth is not compiling...

 import org.apache.spark.mllib.fpm.FPGrowth;
 import org.apache.spark.mllib.fpm.FPGrowthModel;

 How to include the dependency in the POM file?

 On Sat, Jun 13, 2015 at 4:26 AM, masoom alam masoom.a...@wanclouds.net
  wrote:

 Thanks for the answer. Any example?
 On Jun 13, 2015 2:13 PM, Sonal Goyal sonalgoy...@gmail.com wrote:

 I think you need to add dependency to spark mllib too

Not able to run FP-growth Example

2015-06-13 Thread masoom alam
Hi everyone,

I am trying to run the FP growth example. I have tried to compile the
following POM file:

<project>
<groupId>com.oreilly.learningsparkexamples.mini</groupId>
<artifactId>learning-spark-mini-example</artifactId>
<modelVersion>4.0.0</modelVersion>
<name>example</name>
<packaging>jar</packaging>
<version>0.0.1</version>
<dependencies>
<dependency> <!-- Spark dependency -->
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.3.0</version>
<scope>provided</scope>
</dependency>
</dependencies>
<properties>
<java.version>1.7</java.version>
</properties>
<build>
<pluginManagement>
<plugins>
<plugin> <groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.1</version>
<configuration>
<source>${java.version}</source>
<target>${java.version}</target>
</configuration>
</plugin>
</plugins>
</pluginManagement>
</build>
</project>

It successfully builds the project, but the IDE is complaining that: Error:(29,
34) java: package org.apache.spark.mllib.fpm does not exist

Just as a side note, I downloaded version 1.3 of Spark, so the FP-growth
algorithm should be part of it?
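
Once the spark-mllib dependency resolves, the code itself is along the lines
of the MLlib Java FP-growth walkthrough (API as of Spark 1.3/1.4); a minimal
sketch, with an illustrative class name and toy transactions:

import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.fpm.FPGrowth;
import org.apache.spark.mllib.fpm.FPGrowthModel;

public class JavaFPGrowthExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("FP-growth");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // Toy transaction data; in practice this would be read from a file.
        JavaRDD<List<String>> transactions = sc.parallelize(Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("a", "b", "d"),
                Arrays.asList("a", "e")));

        FPGrowth fpg = new FPGrowth()
                .setMinSupport(0.5)
                .setNumPartitions(2);
        FPGrowthModel<String> model = fpg.run(transactions);

        for (FPGrowth.FreqItemset<String> itemset :
                model.freqItemsets().toJavaRDD().collect()) {
            System.out.println(itemset.javaItems() + " -> " + itemset.freq());
        }

        sc.stop();
    }
}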

Thanks.


Re: Not able to run FP-growth Example

2015-06-13 Thread masoom alam
Thanks for the answer. Any example?
On Jun 13, 2015 2:13 PM, Sonal Goyal sonalgoy...@gmail.com wrote:

 I think you need to add dependency to spark mllib too.
 On Jun 13, 2015 11:10 AM, masoom alam masoom.a...@wanclouds.net wrote:

 Hi everyone,

 I am trying to run the FP growth example. I have tried to compile the
 following POM file:

 <project>
 <groupId>com.oreilly.learningsparkexamples.mini</groupId>
 <artifactId>learning-spark-mini-example</artifactId>
 <modelVersion>4.0.0</modelVersion>
 <name>example</name>
 <packaging>jar</packaging>
 <version>0.0.1</version>
 <dependencies>
 <dependency> <!-- Spark dependency -->
 <groupId>org.apache.spark</groupId>
 <artifactId>spark-core_2.10</artifactId>
 <version>1.3.0</version>
 <scope>provided</scope>
 </dependency>
 </dependencies>
 <properties>
 <java.version>1.7</java.version>
 </properties>
 <build>
 <pluginManagement>
 <plugins>
 <plugin> <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-compiler-plugin</artifactId>
 <version>3.1</version>
 <configuration>
 <source>${java.version}</source>
 <target>${java.version}</target>
 </configuration>
 </plugin>
 </plugins>
 </pluginManagement>
 </build>
 </project>

 It successfully builds the project, but IDE is complaining
 that: Error:(29, 34) java: package org.apache.spark.mllib.fpm does not exist

 Just as a side note, I downloaded Version 1.3 of Spark so FP-growth
 algorithm should be part of it?

 Thanks.