spark ML PR committers

2018-11-05 Thread Ilya Matiach
Hi spark devs, Who are the main committers/reviewers for the [ML] and [MLLib] namespace currently? Could someone please review & merge this simple PR which I reviewed: https://github.com/apache/spark/pull/22087 I also have two PRs for review, one of which was approved by one reviewer back in May

barrier execution mode with DataFrame and dynamic allocation

2018-12-19 Thread Ilya Matiach
[Note: I sent this earlier but it looks like the email was blocked because I had another email group on the CC line] Hi Spark Dev, I would like to use the new barrier execution mode introduced in spark 2.4

RE: barrier execution mode with DataFrame and dynamic allocation

2018-12-28 Thread Ilya Matiach
ark dev email list does by default to email senders, I’ve seen it added to other emails on the mailing list before. Thank you and Happy Holidays, Ilya From: Xiangrui Meng Sent: Wednesday, December 19, 2018 12:16 PM To: Ilya Matiach Cc: dev@spark.apache.org Subject: Re: barrier execution mode wi

RE: How to implement model versions in MLlib?

2019-01-16 Thread Ilya Matiach
Hi Sean and Jatin, Could you point to some examples of load() methods that use the spark version vs the model version (or the columns available)? I see only cases where we use the spark version (eg https://github.com/apache/spark/blob/c04ad17ccf14a07ffdb2bf637124492a341075f2/mllib/src/main/scala/

RE: Detect executor core count

2019-06-18 Thread Ilya Matiach
Hi Andrew, I tried something similar in the LightGBM classifier/regressor/ranker in the mmlspark package: I use the Spark conf and, if it is not configured, I get the processor count from the JVM directly: https://github.com/Azure/mmlspark/blob/master/src/lightgbm/src/main/scala/LightGBMUtils
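The fallback described in this message (use the configured core count if present, otherwise ask the runtime) can be sketched roughly as follows. This is a minimal illustration, not the mmlspark code: a plain dict stands in for the Spark configuration, `executor_cores` is a hypothetical helper name, and `os.cpu_count()` plays the role that `Runtime.getRuntime().availableProcessors()` would play on the JVM.

```python
import os


def executor_cores(conf, key="spark.executor.cores"):
    """Return the configured core count, or fall back to the local CPU count.

    `conf` is a plain dict standing in for a Spark configuration (assumption).
    """
    raw = conf.get(key)
    if raw is not None:
        try:
            cores = int(raw)
            if cores > 0:
                return cores
        except ValueError:
            # Unparseable setting: ignore it and use the fallback below.
            pass
    # os.cpu_count() may return None on some platforms, so default to 1.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(executor_cores({"spark.executor.cores": "4"}))
    print(executor_cores({}))
```

Note that the fallback reports the cores of the machine the code runs on, which may differ from what the scheduler actually assigns to an executor.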

RE: read image or binary files / spark 2.3

2019-09-05 Thread Ilya Matiach
Hi Peter, You can use the spark.readImages API in spark 2.3 for reading images: https://databricks.com/blog/2018/12/10/introducing-built-in-image-data-source-in-apache-spark-2-4.html https://blogs.technet.microsoft.com/machinelearning/2018/03/05/image-data-support-in-apache-spark/ https://spark.a

RE: Spark Tasks Progress

2019-09-23 Thread Ilya Matiach
@Sultan Alamro great question. I had a similar scenario, where workers needed to aggregate host:port information for initializing an MPI ring, and I used direct socket communication between the workers and driver. This is where the driver accepts sockets from wo
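The pattern mentioned in this message, where workers report their host:port to the driver over plain sockets and receive the full list back, might look something like the sketch below. It is an assumption-laden illustration rather than the author's implementation: the function names, the newline-delimited wire format, and the example addresses are all made up, and threads on localhost stand in for real workers.

```python
import socket
import threading


def run_driver(server, num_workers):
    """Accept one connection per worker, collect endpoints, send everyone the full list."""
    conns, nodes = [], []
    for _ in range(num_workers):
        conn, _addr = server.accept()
        with conn.makefile("r") as reader:
            nodes.append(reader.readline().strip())
        conns.append(conn)
    payload = (",".join(nodes) + "\n").encode()
    for conn in conns:
        conn.sendall(payload)
        conn.close()
    return nodes


def run_worker(driver_addr, my_endpoint):
    """Report this worker's endpoint, then wait for the aggregated list."""
    with socket.create_connection(driver_addr) as sock:
        sock.sendall((my_endpoint + "\n").encode())
        with sock.makefile("r") as reader:
            return reader.readline().strip().split(",")


def demo(num_workers=3):
    """Run a driver and several workers on localhost (hypothetical endpoints)."""
    server = socket.socket()
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(num_workers)
    addr = server.getsockname()
    results = [None] * num_workers

    def worker(i):
        results[i] = run_worker(addr, "10.0.0.%d:%d" % (i, 5000 + i))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    nodes = run_driver(server, num_workers)
    for t in threads:
        t.join()
    server.close()
    return nodes, results


if __name__ == "__main__":
    print(demo())
```

Every worker ends up with the same ordered list, which is the kind of shared view needed to initialize a ring; the order is simply the order in which the driver accepted connections.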

mllib metrics vs ml evaluators and how to improve apis for users

2016-12-29 Thread Ilya Matiach
Hi ML/MLLib developers, 1. I'm trying to add a weights column to ml spark evaluators (RegressionEvaluator, BinaryClassificationEvaluator, MulticlassClassificationEvaluator) that use mllib metrics and I have a few questions (JIRA SPARK-18693

RE: Feedback on MLlib roadmap process proposal

2017-01-24 Thread Ilya Matiach
Just a few questions with regards to the MLLIB process: 1. Is there a list of committers who can/are shepherds and what code they own? I’ve seen this page: http://spark.apache.org/committers.html but I’m not sure if it is up to date and it doesn’t mention what code the committers own. It

RE: Feedback on MLlib roadmap process proposal

2017-01-24 Thread Ilya Matiach
wen [mailto:so...@cloudera.com] Sent: Tuesday, January 24, 2017 11:23 AM To: Ilya Matiach Cc: dev@spark.apache.org Subject: Re: Feedback on MLlib roadmap process proposal On Tue, Jan 24, 2017 at 3:58 PM Ilya Matiach <il...@microsoft.com> wrote: Just a few questions with regards to the