andrewpalumbo commented on a change in pull request #394:
URL: https://github.com/apache/mahout/pull/394#discussion_r414073687
##########
File path:
community/community-engines/flink-batch/src/main/scala/org/apache/mahout/flinkbindings/drm/CheckpointedFlinkDrm.scala
##########
@@ -62,17 +62,16 @@ class CheckpointedFlinkDrm[K: ClassTag:TypeInformation](val ds: DrmDataSet[K],
// this is extra I/O for each cache call. this needs to be moved somewhere where it is called
// only once. Possibly FlinkDistributedEngine.
- GlobalConfiguration.loadConfiguration(mahoutHome + "/conf/flink-config.yaml")
+ GlobalConfiguration.loadConfiguration(mahoutHome + "/conf/")
Review comment:
Getting a failure here with `mvn clean package install`. We've moved
the `Flink` module to community so that any members of the community can make
use of it, see what we ran into, etc. Our main issue was with Flink's greedy
execution of code rewritten by the optimizer, which overflowed memory
mid-expression when that should not have been necessary:
E.g.
` X:= AX.tX * B.tB - c`
when put through the optimizer will come out as
` A:= SelfSq(X)AB.tB -c`
Something similar was happening; I can't remember exactly, but the second
`B.tB` would be expanded into memory. I may be way off. It was the same for
iterative algorithms, though: the DAG would be recomputed at each iteration for
certain expressions that were meant to be checkpointed and cached. These
checkpoints would evaluate eagerly and crash iterative algorithms like DSSVD
and DSPCA.
I'm unsure of the current state of Flink Batch, but that is our simple
explanation.
```
org.apache.mahout.flinkbindings.DrmLikeOpsSuite *** ABORTED ***
org.apache.flink.configuration.IllegalConfigurationException: The Flink config file '/Users/colleenpalumbo/sandbox/mahout/conf/flink-conf.yaml' (/Users/colleenpalumbo/sandbox/mahout/conf) does not exist.
  at org.apache.flink.configuration.GlobalConfiguration.loadConfiguration(GlobalConfiguration.java:124)
  at org.apache.flink.configuration.GlobalConfiguration.loadConfiguration(GlobalCon
```
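A rough sketch of a pre-flight check that would surface this before Flink throws. It assumes the file name `flink-conf.yaml` from the exception above. `FlinkConfDir` and `resolve` are hypothetical names, not part of Mahout or Flink, and nothing here calls Flink itself:

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper for illustration only; not part of Mahout or Flink.
object FlinkConfDir {
  // File name taken from the exception message above (assumption).
  val ConfFileName = "flink-conf.yaml"

  /** Returns Right(confDir) if <mahoutHome>/conf/flink-conf.yaml exists,
    * otherwise Left with a message naming the missing file. */
  def resolve(mahoutHome: String): Either[String, Path] = {
    val confDir = Paths.get(mahoutHome, "conf")
    val confFile = confDir.resolve(ConfFileName)
    if (Files.isRegularFile(confFile)) Right(confDir)
    else Left(s"Flink config file '$confFile' does not exist")
  }
}
```

The `Right` value is the directory, which is the kind of argument the `+` line in the diff passes to `GlobalConfiguration.loadConfiguration`; a `Left` could be logged or turned into a skipped test instead of an aborted suite.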
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.