dweiss commented on a change in pull request #759: URL: https://github.com/apache/lucene/pull/759#discussion_r832917684
########## File path: gradle/datasets/external-datasets.gradle ########## @@ -120,10 +120,9 @@ configure(project(":lucene:benchmark")) { task getReuters(type: Download) { ext { name = "reuters21578" - // note: there is no HTTPS url and we don't care because this is merely test/perf data - src = "http://www.daviddlewis.com/resources/testcollections/reuters21578/${name}.tar.gz" + src = "https://kdd.ics.uci.edu/databases/${name}/${name}.tar.gz" intermediate = file("${dataDir}/${name}.tar.gz") - dst = file("${dataDir}/${name}") Review comment: I'd keep the reutrers21578 name here (not reuters-out). ########## File path: lucene/benchmark/conf/micro-standard.alg ########## @@ -30,8 +30,7 @@ doc.tokenized=true doc.term.vector=false log.step=500 -work.dir=data -docs.dir=reuters21578 +docs.dir=reuters-out Review comment: Same here. ########## File path: lucene/benchmark/conf/readContentSource.alg ########## @@ -24,7 +24,7 @@ # # To use this, first cd to benchmark and then run: # -# ant run-task -Dtask.alg=conf/readContentSource.alg +# gradlew run -P task.alg=conf/readContentSource.alg Review comment: I don't think it'll work if you run gradlew from the top-level of the project and give a module-relative path to -Ptask.alg (haven't checked). If task.alg is resolved relative to the module then maybe it should be resolved relative to conf (not to give a false impression that it's a "path" relative to cwd)? The second thing is that the run task may be present in other modules besides benchmarks; I would give an explicit task path here: gradlew :lucene:benchmarks:run -Ptask.alg=... so that you can't accidentally start Luke or something else, accidently. ########## File path: lucene/benchmark/conf/tokenize.alg ########## @@ -22,7 +22,7 @@ # # To use this, cd to benchmark and then run: # -# ant run-task -Dtask.alg=conf/tokenize.alg +# gradlew run -P task.alg=conf/tokenize.alg Review comment: This white space between -P and the property is somehow disturbing. Even if it works, it's really not following the convention syntax of how properties are passed to gradle [1]. https://docs.gradle.org/current/userguide/command_line_interface.html#sec:environment_options -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org For additional commands, e-mail: issues-h...@lucene.apache.org