Thanks Ted,
that helped me. It turned out that I had wrongly formatted the name of the
server: I had to add spark:// in front of the server name.
Cheers,
Andrejs
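For anyone hitting the same error: a standalone master has to be given as a spark:// URL (e.g. spark://host:7077), not a bare host name. A small sketch of that fix; the helper name and the default port 7077 are my own illustration, not anything from Spark:

```scala
// Hypothetical helper (not part of Spark): prepend the spark:// scheme
// to a bare host name, leaving local[*] and already-qualified URLs alone.
def normalizeMaster(host: String, port: Int = 7077): String =
  if (host.startsWith("spark://") || host == "local" || host.startsWith("local["))
    host
  else
    s"spark://$host:$port"

println(normalizeMaster("myServerName"))              // spark://myServerName:7077
println(normalizeMaster("spark://myServerName:7077")) // unchanged
```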
On 11/11/15 14:26, Ted Yu wrote:
Please take a look
at launcher/src/test/java/org/apache/spark/launcher/SparkLauncherSuite.java
to see how
op2.6")
  .setAppResource("/home/user/MyCode/forSpark/wordcount.py")
  .addPyFile("/home/andabe/MyCode/forSpark/wordcount.py")
.setMaster("myServerName")
.setAppName("pytho2word")
.launch();
println("finishing")
spark.waitFor();
println("finished")
Any help is appreciated.
Cheers,
Andrejs
Thank you for the information.
Cheers,
Andrejs
On 04/18/2015 10:23 AM, Nick Pentreath wrote:
ES-hadoop uses a scan-and-scroll search to efficiently retrieve large
result sets. Scores are not tracked in a scan, and sorting is not
supported, hence the 0.0 scores.
http://www.elastic.co/guide/en
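If scores are actually needed, es-hadoop also accepts an explicit query through its es.query setting, alongside es.resource. A minimal sketch of that configuration; the index/type dbpedia/docs is taken from the output below, and the match_all body is only a placeholder:

```scala
// Sketch only: a configuration map for elasticsearch-hadoop.
// es.resource and es.query are real es-hadoop settings; the index name
// and the query body here are placeholders for illustration.
val esConf = Map(
  "es.resource" -> "dbpedia/docs",
  "es.query"    -> """{"query": {"match_all": {}}}"""
)
println(esConf("es.query"))
```

These es.* keys would typically be set on the SparkConf (or passed to the elasticsearch-spark read methods) rather than kept in a plain Map.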
- Map(_index -> dbpedia, _type -> docs, _id ->
AUy5aQs7895C6HE5GmG4, _score -> 0.0))
As you can see, _score is 0.
Would appreciate any help,
Cheers,
Andrejs
Hi,
Can someone please suggest the best way to output Spark data as a
JSON file (a file where each line is a JSON object)?
Cheers,
Andrejs
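The "one JSON object per line" format itself can be sketched without Spark at all. A hand-rolled serializer for flat string fields only (an assumption to keep it short; real code would use a JSON library, or Spark SQL's own DataFrame.toJSON / write.json):

```scala
// Minimal sketch of JSON-lines output: one JSON object per line.
// Handles only flat String fields; quote characters are escaped.
def toJsonLine(fields: Map[String, String]): String =
  fields.map { case (k, v) => "\"" + k + "\":\"" + v.replace("\"", "\\\"") + "\"" }
    .mkString("{", ",", "}")

val records = Seq(
  Map("id" -> "1", "title" -> "spark"),
  Map("id" -> "2", "title" -> "mllib")
)
records.map(toJsonLine).foreach(println)  // {"id":"1","title":"spark"} ...
```

In Spark itself, mapping each record through such a function and calling saveAsTextFile gives exactly this layout, which is also what Spark's JSON reader expects back.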
I found my problem. I assumed, based on the TF-IDF article on Wikipedia, that
log base 10 is used, but as I found in this discussion
https://groups.google.com/forum/#!topic/scala-language/K5tbYSYqQc8, in
Scala it is actually ln (the natural logarithm).
Regards,
Andrejs
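The base discrepancy is easy to check numerically. A worked example with assumed numbers (a corpus of N = 100 documents, a term appearing in df = 10 of them), using the plain log(N/df) form of idf rather than MLlib's exact smoothed formula:

```scala
// Worked check: the same idf computed with log10 vs ln.
// N = 100 documents, term occurs in df = 10 of them (illustrative numbers).
val n  = 100.0
val df = 10.0
val idfBase10  = math.log10(n / df) // 1.0
val idfNatural = math.log(n / df)   // ~2.3026
println(s"log10: $idfBase10, ln: $idfNatural")
```

The two differ only by the constant factor ln(10) ≈ 2.3026, so scores are rescaled but the term ranking is unchanged.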
On Thu, Oct 30, 2014 at 10:49 PM, Ashic
Hi,
I'm new to MLlib and Spark. I'm trying to use TF-IDF and use those values
for term ranking.
I'm getting the tf values in vector format, but how can I get the values
out of the vector?
val sc = new SparkContext(conf)
val documents: RDD[Seq[String]] =
sc.textFile("/home/andrejs/Datasets/dbpedia
Best regards,
Andrejs
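On the original question: an MLlib Vector exposes its contents via toArray (and a SparseVector additionally via its indices and values). A pure-Scala sketch of the same sparse-to-dense idea, with a Map standing in for the sparse tf vector (the data here is made up):

```scala
// A sparse tf vector modeled as index -> value, standing in for
// MLlib's SparseVector (values here are assumed, for illustration).
val size = 5
val sparseTf = Map(0 -> 2.0, 3 -> 1.0)

// Densify: the same idea as calling Vector.toArray in MLlib.
val dense = Array.tabulate(size)(i => sparseTf.getOrElse(i, 0.0))
println(dense.mkString("[", ", ", "]")) // [2.0, 0.0, 0.0, 1.0, 0.0]
```

With a real MLlib vector, tfVector.toArray yields the same kind of dense Array[Double], which can then be zipped with term indices for ranking.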