Re: Spark SQL 1.0.1 error on reading fixed length byte array

2014-08-03 Thread Pei-Lun Lee
Hi,

We have a PR to support fixed-length byte arrays in Parquet files.

https://github.com/apache/spark/pull/1737

Can someone help verify it?
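
For anyone willing to verify, a rough spark-shell sketch (assumptions: a
Spark 1.0/1.1-era build of the PR branch, and a Parquet file at /tmp/foo with
an optional fixed_len_byte_array(4) column named "b", matching the schema
quoted further down this thread):

// illustrative only - paths, table and column names are assumptions
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val parquetFile = sqlContext.parquetFile("/tmp/foo")
parquetFile.registerAsTable("foo")
// both the schema conversion and a simple scan should now succeed
sqlContext.sql("SELECT name, b FROM foo").take(5).foreach(println)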

Thanks.

2014-07-15 19:23 GMT+08:00 Pei-Lun Lee :

> Sorry, should be SPARK-2489
>
>
> 2014-07-15 19:22 GMT+08:00 Pei-Lun Lee :
>
> Filed SPARK-2446
>>
>>
>>
>> 2014-07-15 16:17 GMT+08:00 Michael Armbrust :
>>
>> Oh, maybe not.  Please file another JIRA.
>>>
>>>
>>> On Tue, Jul 15, 2014 at 12:34 AM, Pei-Lun Lee  wrote:
>>>
 Hi Michael,

 Good to know it is being handled. I tried master branch (9fe693b5) and
 got another error:

 scala> sqlContext.parquetFile("/tmp/foo")
 java.lang.RuntimeException: Unsupported parquet datatype optional
 fixed_len_byte_array(4) b
 at scala.sys.package$.error(package.scala:27)
 at
 org.apache.spark.sql.parquet.ParquetTypesConverter$.toPrimitiveDataType(ParquetTypes.scala:58)
  at
 org.apache.spark.sql.parquet.ParquetTypesConverter$.toDataType(ParquetTypes.scala:109)
 at
 org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$convertToAttributes$1.apply(ParquetTypes.scala:282)
  at
 org.apache.spark.sql.parquet.ParquetTypesConverter$$anonfun$convertToAttributes$1.apply(ParquetTypes.scala:279)
 ..

 The avro schema I used is something like:

 protocol Test {
 fixed Bytes4(4);

 record User {
 string name;
 int age;
 union {null, int} i;
 union {null, int} j;
 union {null, Bytes4} b;
 union {null, bytes} c;
 union {null, int} d;
 }
 }

 Is this case included in SPARK-2446
 ?


 2014-07-15 3:54 GMT+08:00 Michael Armbrust :

 This is not supported yet, but there is a PR open to fix it:
> https://issues.apache.org/jira/browse/SPARK-2446
>
>
> On Mon, Jul 14, 2014 at 4:17 AM, Pei-Lun Lee  wrote:
>
>> Hi,
>>
>> I am using spark-sql 1.0.1 to load parquet files generated from
>> method described in:
>>
>> https://gist.github.com/massie/7224868
>>
>>
>> When I try to submit a select query with columns of type fixed length
>> byte array, the following error pops up:
>>
>>
>> 14/07/14 11:09:14 INFO scheduler.DAGScheduler: Failed to run take at
>> basicOperators.scala:100
>> org.apache.spark.SparkDriverExecutionException: Execution error
>> at
>> org.apache.spark.scheduler.DAGScheduler.runLocallyWithinThread(DAGScheduler.scala:581)
>> at
>> org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:559)
>> Caused by: parquet.io.ParquetDecodingException: Can not read value at
>> 0 in block -1 in file s3n://foo/bar/part-r-0.snappy.parquet
>> at
>> parquet.hadoop.InternalParquetRecordReader.nextKeyValue(InternalParquetRecordReader.java:177)
>> at
>> parquet.hadoop.ParquetRecordReader.nextKeyValue(ParquetRecordReader.java:130)
>> at
>> org.apache.spark.rdd.NewHadoopRDD$$anon$1.hasNext(NewHadoopRDD.scala:122)
>> at
>> org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
>> at
>> scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>> at
>> scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
>> at
>> scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
>> at
>> scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:308)
>> at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>> at
>> scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>> at
>> scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>> at
>> scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
>> at scala.collection.TraversableOnce$class.to
>> (TraversableOnce.scala:273)
>> at scala.collection.AbstractIterator.to(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
>> at
>> scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
>> at
>> scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
>> at
>> scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
>> at org.apache.spark.rdd.RDD$$anonfun$27.apply(RDD.scala:989)
>> at org.apache.spark.rdd.RDD$$anonfun$27.apply(RDD.scala:989)
>> at
>> org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1083)
>> at
>> org.apache.spark.scheduler.DAGScheduler.runLocallyWit

Re: -1s on pull requests?

2014-08-03 Thread Patrick Wendell
Sure thing - feel free to ping me off list if you need pointers. The script
just does string concatenation and a curl to post the comment... I think it
should be pretty accessible!
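
For reference, the posting step amounts to roughly the following - a hedged
sketch only (the real script is bash; this is the same idea written as Scala
shelling out to curl, and the PR number, commit hash, and token variable are
illustrative assumptions):

import scala.sys.process._

// illustrative values only
val token      = sys.env("GITHUB_TOKEN")
val prNumber   = 1737
val commitHash = "9fe693b5"

// build the comment body as a string, then POST it to the GitHub issues API
val body = s"""{"body": "QA tests have started for PR $prNumber at commit $commitHash."}"""
Seq("curl", "-s",
  "-H", s"Authorization: token $token",
  "-d", body,
  s"https://api.github.com/repos/apache/spark/issues/$prNumber/comments").!!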

- Patrick


On Sun, Aug 3, 2014 at 9:12 PM, Nicholas Chammas  wrote:

> On Sun, Aug 3, 2014 at 11:29 PM, Patrick Wendell 
> wrote:
>
> Nick - Any interest in doing these? this is all doable from within the
>> spark repo itself because our QA harness scripts are in there:
>>
>> https://github.com/apache/spark/blob/master/dev/run-tests-jenkins
>>
>> If not, could you make a JIRA for them and put it under "Project Infra".
>>
> I'll make the JIRA and think about how to do this stuff. I'll have to
> understand what that run-tests-jenkins script does and see how easy it is
> to extend.
>
> Nick
> 
>


Re: -1s on pull requests?

2014-08-03 Thread Nicholas Chammas
On Sun, Aug 3, 2014 at 11:29 PM, Patrick Wendell  wrote:

Nick - Any interest in doing these? this is all doable from within the
> spark repo itself because our QA harness scripts are in there:
>
> https://github.com/apache/spark/blob/master/dev/run-tests-jenkins
>
> If not, could you make a JIRA for them and put it under "Project Infra".
>
I’ll make the JIRA and think about how to do this stuff. I’ll have to
understand what that run-tests-jenkins script does and see how easy it is
to extend.

Nick
​


Re: Scala 2.11 external dependencies

2014-08-03 Thread Patrick Wendell
Hey Anand,

Thanks for looking into this - it's great to see momentum towards Scala
2.11, and I'd love it if this landed in Spark 1.2.

For the external dependencies, it would be good to create a sub-task of
SPARK-1812 to track our efforts encouraging other projects to upgrade. In
certain cases (e.g. Kafka) there is already fairly late-stage work on this,
so we can link to those JIRAs as well. A good starting point is to just go
to their dev list and ask what the status is; most Scala projects have put
at least some thought into this already. Another thing we can do
is submit patches ourselves to those projects to help get them upgraded.
The twitter libraries, e.g., tend to be pretty small and also open to
external contributions.

One other thing in the mix here - Prashant Sharma has also spent some time
looking at this, so it might be good for you two to connect (probably off
list) and sync up. Prashant has contributed to many Scala projects, so he
might have cycles to go and help some of our dependencies get upgraded -
but I won't commit to that on his behalf :).

Regarding Akka - I shaded and published akka as a one-off thing:
https://github.com/pwendell/akka/tree/2.2.3-shaded-proto

Over time we've had to publish our own versions of a small number of
dependencies. It's somewhat high overhead, but it actually works quite well
in terms of avoiding some of the nastier dependency conflicts. At least
better than other alternatives I've seen such as using a shader build
plug-in.

Going forward, I'd actually like to track these in the Spark repo itself.
For instance, we have a bash script in the spark repo that can e.g. check
out akka, apply a few patches or regular expressions, and then you have a
fully shaded dependency that can be published to maven. If you wanted to
take a crack at something like that for akka 2.3.4, be my guest. I can help
with the actual publishing.

- Patrick


On Sat, Aug 2, 2014 at 6:04 PM, Anand Avati  wrote:

> We are currently blocked on non availability of the following external
> dependencies for porting Spark to Scala 2.11 [SPARK-1812 Jira]:
>
> - akka-*_2.11 (2.3.4-shaded-protobuf from org.spark-project). The shaded
> protobuf needs to be 2.5.0, and the shading is needed because Hadoop1
> specifically needs protobuf 2.4. The issues arising from this
> incompatibility are already explained in the SPARK-1812 JIRA.
>
> - chill_2.11 (0.4 from com.twitter) for core
> - algebird_2.11 (0.7 from com.twitter) for examples
> - kafka_2.11 (0.8 from org.apache) for external/kafka and examples
> - akka-zeromq_2.11 (2.3.4 from com.typesafe, but probably not needed if a
> shaded-protobuf version is released from org.spark-project)
>
> First,
> Who do I pester to get org.spark-project artifacts published for the akka
> shaded-protobuf version?
>
> Second,
> In the past what has been the convention to request/pester external
> projects to re-release artifacts in a new Scala version?
>
> Thanks!
>


Re: Low Level Kafka Consumer for Spark

2014-08-03 Thread Patrick Wendell
I'll let TD chime in on this one, but I'm guessing this would be a welcome
addition. It's great to see community effort on adding new
streams/receivers; adding a Java API for receivers was something we did
specifically to allow this :)
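
For context, the receiver hook this builds on looks roughly like the sketch
below - a minimal custom receiver, not the actual
consumer.kafka.client.KafkaReceiver; fetchNextBatch() is a placeholder for
the SimpleConsumer fetch/offset logic described in the quoted mail:

import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.receiver.Receiver

// A background thread fetches messages and hands them to Spark via store().
class SketchKafkaReceiver(topic: String)
  extends Receiver[String](StorageLevel.MEMORY_AND_DISK_SER) {

  def onStart(): Unit = {
    new Thread(s"kafka-fetcher-$topic") {
      override def run(): Unit = {
        while (!isStopped()) {
          fetchNextBatch().foreach(msg => store(msg))
        }
      }
    }.start()
  }

  def onStop(): Unit = {
    // close the SimpleConsumer, commit final offsets to ZooKeeper, etc.
  }

  // placeholder for the actual Kafka SimpleConsumer fetch loop
  private def fetchNextBatch(): Seq[String] = Seq.empty
}

// usage: ssc.receiverStream(new SketchKafkaReceiver("events"))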

- Patrick


On Sat, Aug 2, 2014 at 10:09 AM, Dibyendu Bhattacharya <
dibyendu.bhattach...@gmail.com> wrote:

> Hi,
>
> I have implemented a Low Level Kafka Consumer for Spark Streaming using the
> Kafka Simple Consumer API. This API gives better control over Kafka offset
> management and recovery from failures. Since the present Spark KafkaUtils
> uses the high-level Kafka consumer API, I wanted better control over offset
> management, which is not possible with the Kafka high-level consumer.
>
> This project is available in the repo below:
>
> https://github.com/dibbhatt/kafka-spark-consumer
>
>
> I have implemented a custom receiver, consumer.kafka.client.KafkaReceiver.
> The KafkaReceiver uses the low-level Kafka Consumer API (implemented in the
> consumer.kafka packages) to fetch messages from Kafka and 'store' them in
> Spark.
>
> The logic detects the number of partitions for a topic and spawns that many
> threads (individual consumer instances). The Kafka consumer uses ZooKeeper
> to store the latest offset for each partition, which helps recovery in case
> of failure. The Kafka consumer logic is tolerant of ZK failures, Kafka
> partition-leader changes, Kafka broker failures, offset errors, and other
> fail-over scenarios.
>
> consumer.kafka.client.Consumer is a sample consumer that uses this Kafka
> receiver to generate DStreams from Kafka and apply an output operation to
> every message of the RDD.
>
> We are planning to use this Kafka Spark consumer to perform near-real-time
> indexing of Kafka messages into a target search cluster, as well as
> near-real-time aggregation into target NoSQL storage.
>
> Kindly let me know your views. Also, if this looks good, can I contribute it
> to the Spark Streaming project?
>
> Regards,
> Dibyendu
>


Compiling Spark master (6ba6c3eb) with sbt/sbt assembly

2014-08-03 Thread Larry Xiao
On the latest pull today (6ba6c3ebfe9a47351a50e45271e241140b09bf10), I hit an
assembly problem.


$ ./sbt/sbt assembly
Using /usr/lib/jvm/java-7-oracle as default JAVA_HOME.
Note, this will be overridden by -java-home if it is set.
[info] Loading project definition from ~/spark/project/project
[info] Loading project definition from 
~/.sbt/0.13/staging/ec3aa8f39111944cc5f2/sbt-pom-reader/project
[warn] Multiple resolvers having different access mechanism configured 
with same name 'sbt-plugin-releases'. To avoid conflict, Remove 
duplicate project resolvers (`resolvers`) or rename publishing resolver (`publishTo`).
[info] Loading project definition from ~/spark/project
[info] Set current project to spark-parent (in build file:~/spark/)
[info] Compiling 372 Scala sources and 35 Java sources to 
~/spark/core/target/scala-2.10/classes...
[error] 
~/spark/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala:116: 
type mismatch;

[error]  found   : org.apache.spark.ui.jobs.TaskUIData
[error]  required: org.apache.spark.ui.jobs.UIData.TaskUIData
[error]   stageData.taskData.put(taskInfo.taskId, new 
TaskUIData(taskInfo))

[error]   ^
[error] 
~/spark/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala:134: 
type mismatch;

[error]  found   : org.apache.spark.ui.jobs.ExecutorSummary
[error]  required: org.apache.spark.ui.jobs.UIData.ExecutorSummary
[error]   val execSummary = 
execSummaryMap.getOrElseUpdate(info.executorId, new ExecutorSummary)

[error] ^
[error] 
~/spark/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala:163: 
type mismatch;

[error]  found   : org.apache.spark.ui.jobs.TaskUIData
[error]  required: org.apache.spark.ui.jobs.UIData.TaskUIData
[error]   val taskData = 
stageData.taskData.getOrElseUpdate(info.taskId, new TaskUIData(info))

[error] ^
[error] 
~/spark/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala:180: 
type mismatch;

[error]  found   : org.apache.spark.ui.jobs.ExecutorSummary
[error]  required: org.apache.spark.ui.jobs.UIData.ExecutorSummary
[error] val execSummary = 
stageData.executorSummary.getOrElseUpdate(execId, new ExecutorSummary)

[error] ^
[error] 
~/spark/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala:109: type 
mismatch;
[error]  found   : org.apache.spark.ui.jobs.TaskUIData => 
Seq[scala.xml.Node]
[error]  required: org.apache.spark.ui.jobs.UIData.TaskUIData => 
Seq[scala.xml.Node]

[error] Error occurred in an application involving default arguments.
[error] taskHeaders, taskRow(hasInput, hasShuffleRead, 
hasShuffleWrite, hasBytesSpilled), tasks)

[error] ^
[error] 
~/spark/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala:119: constructor 
cannot be instantiated to expected type;

[error]  found   : org.apache.spark.ui.jobs.TaskUIData
[error]  required: org.apache.spark.ui.jobs.UIData.TaskUIData
[error]   val serializationTimes = validTasks.map { case 
TaskUIData(_, metrics, _) =>

[error]  ^
[error] 
~/spark/core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala:120: not 
found: value metrics

[error] metrics.get.resultSerializationTime.toDouble
[error] ^

I think the code doesn't correctly reference the updated structure.

"core/src/main/scala/org/apache/spark/ui/jobs/UIData.scala" is 
introduced in commit 72e9021eaf26f31a82120505f8b764b18fbe8d48
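
If that's the cause, the fix is presumably just to reference the nested
definitions - a sketch of the likely change (my assumption, not a verified
patch) for JobProgressListener.scala and StagePage.scala:

// After commit 72e9021e, TaskUIData and ExecutorSummary are nested inside the
// UIData object, so the call sites need to import them from there:
import org.apache.spark.ui.jobs.UIData.{ExecutorSummary, TaskUIData}

// ...after which lines like the following should type-check again unchanged:
// stageData.taskData.put(taskInfo.taskId, new TaskUIData(taskInfo))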


Larry

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: -1s on pull requests?

2014-08-03 Thread Patrick Wendell
>
>    1. Include the commit hash in the "tests have started/completed"
>    messages, so that it's clear what code exactly is/has been tested for each
>    test cycle.
>

Great idea - I think this is easy to do given the current architecture. We
already have access to the commit ID in the same script that posts the
comments.

   2. "Pin" a message to the start or end of the PR that is updated with
>the status of the PR. "Testing not complete"; "New commits since last
>test"; "Tests failed"; etc. It should be easy for committers to get the
>status of the PR at a glance, without scrolling through the comment
> history.
>

This also is a good idea - I think this would be doable since the GitHub
API allows us to edit comments, but it's a bit trickier. I think it would
require first making an API call to get the "status comment" ID and then
updating it.
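
A rough sketch of that flow (illustrative only - Scala shelling out to curl;
the PR number, comment ID, and token are assumptions):

import scala.sys.process._

val token     = sys.env("GITHUB_TOKEN")
val prNumber  = 1737        // illustrative PR number
val commentId = 12345678L   // illustrative ID of the pinned status comment

// first call: list the PR's comments to locate (or re-locate) the status
// comment ID, unless it was recorded when the comment was first posted
val comments = Seq("curl", "-s",
  "-H", s"Authorization: token $token",
  s"https://api.github.com/repos/apache/spark/issues/$prNumber/comments").!!

// second call: update the status comment in place
val newBody = """{"body": "QA status: new commits since last test."}"""
Seq("curl", "-s", "-X", "PATCH",
  "-H", s"Authorization: token $token",
  "-d", newBody,
  s"https://api.github.com/repos/apache/spark/issues/comments/$commentId").!!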


>
> Nick
>

Nick - Any interest in doing these? This is all doable from within the
Spark repo itself because our QA harness scripts are in there:

https://github.com/apache/spark/blob/master/dev/run-tests-jenkins

If not, could you make a JIRA for them and put it under "Project Infra"?

- Patrick


spark

2014-08-03 Thread 조인수
-- 



*Insu Jo (조인수), Assistant Manager, Process Mining Team, Cyberdigm Co., Ltd.*
*6F, Prime Savings Bank Building, 278-3 Nonhyeon-dong, Gangnam-gu, Seoul*
Office: 02-546-6990
Mobile: 010-4310-0826
Email: auso...@cyberdigm.co.kr

This communication is confidential and intended only for the use of
the individual(s) to whom it is addressed.  The information contained in it
may be the subject of professional privilege or protected from disclosure
for other reasons.  If you are not the intended addressee, please delete
it, notify the sender, and do not disclose or reproduce any part of it
without specific consent. Disclaimer: The statements and opinions expressed
herein are my own and do not necessarily reflect those of Cyberdigm Corporation.


(send this email to subscribe)

2014-08-03 Thread Gurumurthy Yeleswarapu



Re: I would like to contribute

2014-08-03 Thread Josh Rosen
The Contributing to Spark guide on the Spark Wiki provides a good overview of 
how to start contributing:

https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark


On August 3, 2014 at 5:14:23 PM, pritish (prit...@nirvana-international.com) 
wrote:

Hi 

We would like to contribute to Spark but we are not sure how. We can offer 
project management, release management to begin with. Please advice on how to 
get engaged. 

Thank you!! 

Regards 
Pritish 
Nirvana International Inc. 
Big Data, Hadoop, Oracle EBS and IT Solutions 
VA - SWaM, MD - MBE Certified Company 
prit...@nirvana-international.com 
http://www.nirvana-international.com 

> On August 2, 2014 at 9:04 PM Anand Avati  wrote: 
> 
> 
> We are currently blocked on non availability of the following external 
> dependencies for porting Spark to Scala 2.11 [SPARK-1812 Jira]: 
> 
> - akka-*_2.11 (2.3.4-shaded-protobuf from org.spark-project). The shaded 
> protobuf needs to be 2.5.0, and the shading is needed because Hadoop1 
> specifically needs protobuf 2.4. Issues arising because of this 
> incompatibility is already explained in SPARK-1812 Jira. 
> 
> - chill_2.11 (0.4 from com.twitter) for core 
> - algebird_2.11 (0.7 from com.twitter) for examples 
> - kafka_2.11 (0.8 from org.apache) for external/kafka and examples 
> - akka-zeromq_2.11 (2.3.4 from com.typesafe, but probably not needed if a 
> shaded-protobuf version is released from org.spark-project) 
> 
> First, 
> Who do I pester to get org.spark-project artifacts published for the akka 
> shaded-protobuf version? 
> 
> Second, 
> In the past what has been the convention to request/pester external 
> projects to re-release artifacts in a new Scala version? 
> 
> Thanks! 

- 
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org 
For additional commands, e-mail: dev-h...@spark.apache.org 



I would like to contribute

2014-08-03 Thread pritish
Hi

We would like to contribute to Spark, but we are not sure how. We can offer
project management and release management to begin with. Please advise on how
to get engaged.

Thank you!!

Regards
Pritish
Nirvana International Inc.
Big Data, Hadoop, Oracle EBS and IT Solutions
VA - SWaM, MD - MBE Certified Company
prit...@nirvana-international.com
http://www.nirvana-international.com

> On August 2, 2014 at 9:04 PM Anand Avati  wrote:
>
>
> We are currently blocked on non availability of the following external
> dependencies for porting Spark to Scala 2.11 [SPARK-1812 Jira]:
>
> - akka-*_2.11 (2.3.4-shaded-protobuf from org.spark-project). The shaded
> protobuf needs to be 2.5.0, and the shading is needed because Hadoop1
> specifically needs protobuf 2.4. Issues arising because of this
> incompatibility is already explained in SPARK-1812 Jira.
>
> - chill_2.11 (0.4 from com.twitter) for core
> - algebird_2.11 (0.7 from com.twitter) for examples
> - kafka_2.11 (0.8 from org.apache) for external/kafka and examples
> - akka-zeromq_2.11 (2.3.4 from com.typesafe, but probably not needed if a
> shaded-protobuf version is released from org.spark-project)
>
> First,
> Who do I pester to get org.spark-project artifacts published for the akka
> shaded-protobuf version?
>
> Second,
> In the past what has been the convention to request/pester external
> projects to re-release artifacts in a new Scala version?
>
> Thanks!

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re: -1s on pull requests?

2014-08-03 Thread Nicholas Chammas
On Mon, Jul 21, 2014 at 4:44 PM, Kay Ousterhout 
wrote:

> This also happens when something accidentally gets merged after the tests
> have started but before tests have passed.
>

Some improvements to SparkQA  could help with
this. May I suggest:

   1. Include the commit hash in the "tests have started/completed"
   messages, so that it's clear what code exactly is/has been tested for each
   test cycle.
   2. "Pin" a message to the start or end of the PR that is updated with
   the status of the PR. "Testing not complete"; "New commits since last
   test"; "Tests failed"; etc. It should be easy for committers to get the
   status of the PR at a glance, without scrolling through the comment history.

Nick


Re:Re: Intellij IDEA can not recognize the MLlib package

2014-08-03 Thread jun
Got it - I added the spark-mllib library dependency:

libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.0.0"

It works and many thanks!


BR
Kitaev
At 2014-08-03 05:09:23, "Sean Owen"  wrote:
>Yes, but it is nowhere in your project dependencies.
>
>On Sun, Aug 3, 2014 at 10:06 AM, jun  wrote:
>> Sorry the color is missing. the "mllib" is red word and "import" sentence is
>> grey.
>>
>> import org.apache.spark.mllib.recommendation.ALS
>>
>> At 2014-08-03 05:03:31, "jun" wrote:
>> > Hi,
>> >
>> > I have started my spark exploration in intellij IDEA local model and want to
>> > focus on MLlib part.
>> > but when I put some example codes in IDEA, It can not recognize mllib
>> > package, just loos like that:
>> >
>> > import org.apache.spark.SparkContext
>> > import org.apache.spark.mllib.recommendation.ALS
>> >
>> > I hava configured the breeze in build.sbt file and also install the mingw
>> > gcc & gfortran lib. Here is my build.sbt:
>> >
>> >  build.sbt 
>> > name := "SparkMLlibLocal"
>> > version := "1.0"
>> > resolvers += "Ooyala Bintray" at "http://dl.bintray.com/ooyala/maven"
>> > resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
>> > libraryDependencies += "ooyala.cnd" % "job-server" % "0.3.1" % "provided"
>> > libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2"
>> > libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
>> > libraryDependencies ++= Seq(
>> >   "org.scalanlp" %% "breeze" % "0.8.1",
>> >   "org.scalanlp" %% "breeze-natives" % "0.8.1"
>> > )
>> > resolvers ++= Seq(
>> >   "Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/",
>> >   "Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"
>> > )
>> > scalaVersion := "2.10.3"
>> >  End 
>> >
>> > Is there anything I missed?
>> >
>> > BR
>> > Kitaev


Re: Intellij IDEA can not recognize the MLlib package

2014-08-03 Thread Sean Owen
You missed the mllib artifact? That would certainly explain it! All I
see is core.

On Sun, Aug 3, 2014 at 10:03 AM, jun  wrote:
> Hi,
>
>
> I have started my spark exploration in intellij IDEA local model and want to 
> focus on MLlib part.
> but when I put some example codes in IDEA, It can not recognize mllib 
> package, just loos like that:
>
>
>>
>> import org.apache.spark.SparkContext
>> import org.apache.spark.mllib.recommendation.ALS
>>
>
>
> I hava configured the breeze in build.sbt file and also install the mingw gcc 
> & gfortran lib. Here is my build.sbt:
>
>
> build.sbt 
> name := "SparkMLlibLocal"
> version := "1.0"
> resolvers += "Ooyala Bintray" at "http://dl.bintray.com/ooyala/maven"
> resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
> libraryDependencies += "ooyala.cnd" % "job-server" % "0.3.1" % "provided"
> libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2"
> libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
> libraryDependencies ++= Seq(
>   "org.scalanlp" %% "breeze" % "0.8.1",
>   "org.scalanlp" %% "breeze-natives" % "0.8.1"
> )
> resolvers ++= Seq(
>   "Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/",
>   "Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"
> )
> scalaVersion := "2.10.3"
> End 
>
>
> Is there anything I missed?
>
>
> BR
> Kitaev

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org



Re:Intellij IDEA can not recognize the MLlib package

2014-08-03 Thread jun
Sorry, the color is missing: the "mllib" part is in red and the "import"
sentence is in grey.

> import org.apache.spark.mllib.recommendation.ALS

At 2014-08-03 05:03:31, "jun" wrote:
> Hi,
>
> I have started my spark exploration in intellij IDEA local model and want to
> focus on MLlib part.
> but when I put some example codes in IDEA, It can not recognize mllib
> package, just loos like that:
>
> import org.apache.spark.SparkContext
> import org.apache.spark.mllib.recommendation.ALS
>
> I hava configured the breeze in build.sbt file and also install the mingw gcc
> & gfortran lib. Here is my build.sbt:
>
>  build.sbt 
> name := "SparkMLlibLocal"
> version := "1.0"
> resolvers += "Ooyala Bintray" at "http://dl.bintray.com/ooyala/maven"
> resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
> libraryDependencies += "ooyala.cnd" % "job-server" % "0.3.1" % "provided"
> libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2"
> libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
> libraryDependencies ++= Seq(
>   "org.scalanlp" %% "breeze" % "0.8.1",
>   "org.scalanlp" %% "breeze-natives" % "0.8.1"
> )
> resolvers ++= Seq(
>   "Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/",
>   "Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"
> )
> scalaVersion := "2.10.3"
>  End 
>
> Is there anything I missed?
>
> BR
> Kitaev

Intellij IDEA can not recognize the MLlib package

2014-08-03 Thread jun
Hi,


I have started my Spark exploration in IntelliJ IDEA local mode and want to
focus on the MLlib part, but when I put some example code in IDEA, it cannot
recognize the mllib package. It looks like this:


>
> import org.apache.spark.SparkContext
> import org.apache.spark.mllib.recommendation.ALS
>


I have configured breeze in the build.sbt file and also installed the MinGW gcc
& gfortran libs. Here is my build.sbt:


 build.sbt 
name := "SparkMLlibLocal"
version := "1.0"
resolvers += "Ooyala Bintray" at "http://dl.bintray.com/ooyala/maven"
resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
libraryDependencies += "ooyala.cnd" % "job-server" % "0.3.1" % "provided"
libraryDependencies += "com.github.fommil.netlib" % "all" % "1.1.2"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.0.0"
libraryDependencies ++= Seq(
  "org.scalanlp" %% "breeze" % "0.8.1",
  "org.scalanlp" %% "breeze-natives" % "0.8.1"
)
resolvers ++= Seq(
  "Sonatype Snapshots" at "https://oss.sonatype.org/content/repositories/snapshots/",
  "Sonatype Releases" at "https://oss.sonatype.org/content/repositories/releases/"
)
scalaVersion := "2.10.3"
 End 


Is there anything I missed?


BR
Kitaev