Re: [SMW-devel] The future of the SMW query stores

2013-10-16 Thread Yury Katkov
I suppose that Mongo has been chosen because of its scalability, right?
-
Yury Katkov, WikiVote



On Wed, Oct 16, 2013 at 5:20 AM, david mason
vid_semediawiki-de...@zooid.org wrote:

 May I suggest that ElasticSearch be considered instead of MongoDB?

 ElasticSearch is the index engine of the new MediaWiki Search, so end users
 won't need to set up and support multiple data stores. Like MongoDB it is a
 document store that natively uses JSON, and is really easy to set up and run
 (a .deb is available). It's super easy to work with and, since it's based on
 Lucene, incredibly powerful for many operations. I've used both Mongo and ES
 and definitely prefer the latter.

 They each have their strengths: MongoDB is more of a key-value store, while
 ES is more of a search server (though I'd assert it could do the KV work too,
 adding very useful search operations and requiring no additional
 infrastructure if MW search is already in use). In either case this seems
 like it would be a big win in terms of better structured, more accessible
 data!
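
To make the "document store" idea concrete, here is a minimal sketch of how one page's semantic data might look as a single JSON document. The field names ('subject', 'namespace', 'properties') and the sample values are assumptions for illustration only; they do not come from SMW, MongoDB, or ElasticSearch.

```php
<?php
// Hypothetical shape for one wiki page's semantic data.
// None of these field names are defined by SMW or by either store.
$document = array(
	'subject' => 'Berlin',
	'namespace' => 0,
	'properties' => array(
		'Population' => array( 3500000 ),
		'Located in' => array( 'Germany' ),
	),
);

// Serialize to JSON, the format both stores are described as using.
$json = json_encode( $document );
echo $json . "\n";

// Round trip to confirm the structure survives serialization.
$restored = json_decode( $json, true );
var_dump( $restored === $document );
```

A document like this keeps all property values of a subject together, which is what makes per-page indexing and regeneration straightforward in either store.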

 David


 On 15 October 2013 20:08, Jeroen De Dauw jeroended...@gmail.com wrote:

 Hey,

 The last release introduced SQLStore3, a partial rewrite of SQLStore2,
 improving on the performance of its predecessor. That is not the end of the
 story for the SMW query stores though.

 This email was prompted by work MWJames is doing in supporting MongoDB \o/
 https://gerrit.wikimedia.org/r/#/c/88534/

 For a while now, there have been two items on our Roadmap about utilizing
 new libraries I created for the Wikidata project, that are both based on,
 and usable by, SMW components.

 *
 https://semantic-mediawiki.org/wiki/Roadmap#Make_use_of_DataValues_library
 * https://semantic-mediawiki.org/wiki/Roadmap#Make_use_of_Ask_library

 There is now a third such component, which might enable us to get a nice
 improvement to our SQLStore without all too much effort. I described this
 here:

 https://semantic-mediawiki.org/wiki/Wikibase_QueryEngine

 This is still quite far off, assuming no one else jumps on it, given that
 it requires the earlier two items to be finished first. Feedback on the idea
 is however welcome. And awareness of these preliminary plans, or rather
 possibilities (I'm not committed to doing this at this point), is good for
 those doing or planning to do something related to the SMW storage
 infrastructure.

 Cheers

 --
 Jeroen De Dauw
 http://www.bn2vs.com
 Don't panic. Don't be evil. ~=[,,_,,]:3
 --


 --
 October Webinars: Code for Performance
 Free Intel webinars can help you accelerate application performance.
 Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most
 from
 the latest Intel processors and coprocessors. See abstracts and register 

 http://pubads.g.doubleclick.net/gampad/clk?id=60135031iu=/4140/ostg.clktrk
 ___
 Semediawiki-devel mailing list
 Semediawiki-devel@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/semediawiki-devel





Re: [SMW-devel] how to know the validation option for the parameters in SMWResultPrinter::getParamDefinitions( )?

2013-10-16 Thread Yury Katkov
Hi Jeroen! Can you recommend any result formats that use Validator
correctly and extensively?
-
Yury Katkov, WikiVote



On Sat, Oct 12, 2013 at 4:56 AM, Jeroen De Dauw jeroended...@gmail.com wrote:
 Hey Yury,

 This is unfortunately not properly documented, and as with a lot of MW code,
 the answer is, at least for now, look at the source. This is certainly not
 a good answer and something that should be addressed. Rather than answering
 in detail here, I'll be adding documentation incrementally to the README
 file of Validator with the aim of having the basics covered there by the
 time of its 1.0 release, which will be shortly before the SMW 1.9 one.

 The README can be seen here:
 https://github.com/wikimedia/mediawiki-extensions-Validator/blob/master/README.md

 Some quick replies:


 What values can I use for the values of the key 'default'?

 Anything. This is self defined.


 Where did these words 'message', 'values', 'default', 'aliases' come from?

 This is the definition format in array form as defined by Validator.

 Are there any other interesting words to use (for example 'mandatory'
 would be nice or 'type')?

 There is a type field. A param is mandatory if it does not have a default.
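
As a rough illustration of the answers above, a definition in array form might look like the sketch below. The keys are the ones mentioned in this thread; the concrete values, the second message key, the 'string' type name, and the isMandatory() helper are assumptions made for the example and are not taken from Validator itself.

```php
<?php
// Hypothetical parameter definitions in the array form discussed above.
// The keys come from this thread; the concrete values are made up.
$definitions = array(
	'do awesome' => array(
		'type' => 'string',                      // assumed type name
		'message' => 'srf_paramdesc_doawesome',  // i18n message key
		'default' => '',                         // has a default, so optional
		'aliases' => array( 'awesome' ),
	),
	'source' => array(
		'message' => 'example-paramdesc-source', // hypothetical message key
		'values' => array( 'local', 'remote' ),  // allowed values
		// no 'default' key, so this parameter is mandatory
	),
);

// Illustrative helper, not part of Validator: a parameter is mandatory
// when its definition carries no 'default' key.
function isMandatory( array $definition ) {
	return !array_key_exists( 'default', $definition );
}

foreach ( $definitions as $name => $definition ) {
	echo $name . ': ' . ( isMandatory( $definition ) ? 'mandatory' : 'optional' ) . "\n";
}
```

Note that an empty string still counts as a default here, so 'do awesome' is optional while 'source' is mandatory.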

 Cheers

 --
 Jeroen De Dauw
 http://www.bn2vs.com
 Don't panic. Don't be evil. ~=[,,_,,]:3
 --



Re: [SMW-devel] The future of the SMW query stores

2013-10-16 Thread david mason
With regard to ES and data recovery/transactions, if SMW continues to be
able to generate this data at any time it doesn't seem to be much of an
issue. ES is also horizontally scalable as one of its main features, and
supports geo features and advanced search, although graph traversal is
manual and commits are near-real-time.

I am mainly proposing this to keep things simple for operators. Asking
them to set up, for a single SMW instance, MW, MySQL, SMW, ES (at least for
MW search), and one or two additional stores seems like a lot.

I would guess that there are three kinds of SMW users; 1. those happy using
it as a flexible self-contained front end built on MW for forms and pages,
2. those who would like to use it for Semantic Web / LOD type purposes
(formal ontology design, enforcement, inference, and shared data between
sites using web standards), and 3. those who would at least like a solid
option/path to 2.

For the many members of the community who would benefit from a real focus
on an RDF store and schema support, I would clearly support something like
Richard's stack, but it might add a lot of complexity to hosting and
development. Many SMW users are probably on inexpensive hosting
plans which wouldn't support this broader stack, and as I understand it the
current SMW PHP API is not cleanly designed, so it may basically require a
reinvention (which could be a good thing but would be disruptive).

For myself, I work in a mix of applications and am solidly in camp 3 as a
way forward, fwiw.

And I can't help but wonder how WikiData fits into the mix. (=

David





On 16 October 2013 09:48, Richard Banks richard.bank...@gmail.com wrote:

 Hi,

 Just to add to the conversation, I would also recommend ElasticSearch as a
 great solution for the search side of things. There are also cases of
 people using it as the sole data store. However, I believe caution should
 be taken against such an approach since ES currently doesn't provide much
 in the way of data recovery or transactions.

 For this reason, ES is typically deployed in combination with a data
 storage technology that does support these factors, such as Mongo. ES
 allows you to define what's known as rivers, and these pull data out of a
 configured data source and into the index, thus providing the benefits of
 its powerful search (which is literally insane).

 In terms of making use of the rich inherent graph structure of the data at
 the higher level, a GraphDB would make sense as suggested by Joel. One
 GraphDB that might be worth a look is Titan, which has been developed by
 the Tinkerpop guys I believe. It's a distributed graph database which also
 (interestingly) supports ElasticSearch. It also abstracts over many data
 stores/formats (including RDF) out-of-the-box. Supporting ES is a clever move
 IMO because one of the challenges in graph search is jumping into the graph
 in the first place, and it looks like they use the ES index to do this.

 So, you could almost just use Titan for search, get all the benefits of
 graph traversals etc., and have it manage your ES index too.

 Regards,
 Richard




Re: [SMW-devel] The future of the SMW query stores

2013-10-16 Thread Mark A. Hershberger
Could someone familiar with SMW help develop a role for
MediaWiki-Vagrant?  Especially helpful would be a way to deploy MongoDB
and the like in one go.

Mark.



Re: [SMW-devel] how to know the validation option for the parameters in SMWResultPrinter::getParamDefinitions( )?

2013-10-16 Thread Yury Katkov
During the debugging I saw a lot of interesting fields in SMWParamFormat:
* toLower
* trimValue
* applyManipulationsToDefault
* dependencies
* default
* validationFunction
* type

I'm particularly interested in how to use validationFunction: can I
assign my validation callback like this? I tried, but nothing
happened.

public function getParamDefinitions( array $definitions ) {
  $params = parent::getParamDefinitions( $definitions );

  // Attempt: hand the callback over through the array definition.
  $params['do awesome'] = array(
    'validationFunction' => 'MyClass::isDoAwesomeValid',
  );

  return $params;
}

It seems that I can do the following for the same effect, is that the only way?

public function getParamDefinitions( array $definitions ) {
  $params = parent::getParamDefinitions( $definitions );

  $startpr = array(
    'name' => 'do awesome',
    'message' => 'srf_paramdesc_doawesome',
    'default' => '',
  );

  // Build the definition object first, then attach the callback to it.
  $params['do awesome'] =
    ParamDefinitionFactory::singleton()->newDefinitionFromArray( $startpr );
  $params['do awesome']->setValidationCallback( 'ChapTimeline::isDoAwesomeValid' );

  return $params;
}
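
On the callback itself: PHP has no bare ClassName::method syntax for passing a static method around (written that way it is parsed as a class constant), so the callback has to be given as a string or as an array. The sketch below shows both forms with plain PHP only. The class body and the "receives the value, returns a boolean" contract are my assumptions; I have not checked how Validator actually invokes the callback, and this does not settle whether the array key is honoured.

```php
<?php
// Hypothetical validator class; the name mirrors the one used above.
class ChapTimeline {
	// Assumed contract: receives the raw value, returns true when it is valid.
	public static function isDoAwesomeValid( $value ) {
		return in_array( $value, array( 'yes', 'no' ), true );
	}
}

// Both forms are valid PHP callables for a static method.
$asString = 'ChapTimeline::isDoAwesomeValid';
$asArray = array( 'ChapTimeline', 'isDoAwesomeValid' );

var_dump( is_callable( $asString ) );            // bool(true)
var_dump( call_user_func( $asString, 'yes' ) );  // bool(true)
var_dump( call_user_func( $asArray, 'maybe' ) ); // bool(false)
```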

-
Yury Katkov, WikiVote


