Sure, but at least some of the code would have to be Scala. There are examples in Mahout that take PairRDDs as input, but anything that constructs an IndexedDataset would be fine. I use this code in a system that creates an RDD from HBase. Think of the task as working out how to create a Spark RDD from your DB content.
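To make that concrete, here is a rough sketch of one way to do it, not tested against your setup. It assumes a JDBC-accessible database; the JDBC URL, the table `product_tags`, and its columns `id`, `product_id`, and `tag` are made-up names for illustration. It also assumes a Mahout version whose `IndexedDatasetSpark` companion object accepts an RDD of (rowID, columnID) string pairs, so please check the signatures against the version you run.

```scala
import java.sql.{DriverManager, ResultSet}

import org.apache.mahout.math.cf.SimilarityAnalysis
import org.apache.mahout.sparkbindings._
import org.apache.mahout.sparkbindings.indexeddataset.IndexedDatasetSpark
import org.apache.spark.SparkContext
import org.apache.spark.rdd.{JdbcRDD, RDD}

object ProductSimilarityFromDb {
  def main(args: Array[String]): Unit = {
    // Mahout's helper builds a Spark context configured for Mahout (Kryo, jars).
    val mc = mahoutSparkContext("spark://0.0.0.0:7077", "product-row-similarity")
    val sc: SparkContext = mc.sc

    // Hypothetical connection string; replace with your own database.
    val jdbcUrl = "jdbc:mysql://localhost:3306/shop?user=USER&password=PASSWORD"

    // JdbcRDD requires exactly two '?' placeholders, which it fills with
    // per-partition bounds on a numeric key (here the hypothetical `id` column).
    val productTags: RDD[(String, String)] = new JdbcRDD[(String, String)](
      sc,
      () => DriverManager.getConnection(jdbcUrl),
      "SELECT product_id, tag FROM product_tags WHERE id >= ? AND id <= ?",
      1L,        // lower bound of id (assumed)
      1000000L,  // upper bound of id (assumed)
      10,        // number of partitions
      (rs: ResultSet) => (rs.getString(1), rs.getString(2)))

    // Rows are products, columns are tags.
    val dataset = IndexedDatasetSpark(productTags)(sc)

    // Same computation the spark-rowsimilarity driver runs, with default settings.
    val similarProducts = SimilarityAnalysis.rowSimilarityIDS(dataset)

    // similarProducts is an IndexedDataset (products x products); write it
    // out or push it back to the database here before stopping the context.

    sc.stop()
  }
}
```

The only database-specific part is the code that builds the RDD of (product, tag) pairs. Any other source, HBase included, works the same way as long as it ends up as that kind of pair RDD.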
On May 3, 2016, at 4:32 AM, Rohit Jain <rohitkjai...@gmail.com> wrote:

Hello Everyone,

I have products, and each product has certain tags associated with it. To find similar products I am using the Mahout spark-rowsimilarity algorithm in the following manner:

    $MAHOUT_HOME/mahout spark-rowsimilarity -i hdfs://0.0.0.0:9000/wtrousers -o hdfs://0.0.0.0:9000/s_trousers_out1/ -D:spark.io.compression.=lzf -ma spark://0.0.0.0:7077

To run this command I need to pull data from the database into a flat file. Is there any way I can use this command, or write Java code, to work directly on the database?

--
Thanks & Regards,
*Rohit Jain*
Web developer | Consultant
Mob +91 8097283931