Re: [SMW-devel] Making subobjects correctly ordered
On 20 June 2013 20:32, Yaron Koren ya...@wikiworks.com wrote:
> As for whether storing the index, whether it's part of the subobject name or in a Has index property, changes the data model - I don't think it does.

I think that could work as long as you do not change the identity of the subobject. I.e. these additional properties should not be stored in the hash directly nor used to produce the hash.

> You can simply ignore the index value; I would just think of it as additional data that can either be used or not. If you didn't like my Modification date example, how about this: the subobject hash itself is some additional, SMW-only data that gets added to each row, but that doesn't affect the data model either.

--
This SF.net email is sponsored by Windows: Build for Windows Store. http://p.sf.net/sfu/windows-dev2dev
___
Semediawiki-devel mailing list
Semediawiki-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/semediawiki-devel
Re: [SMW-devel] Making subobjects correctly ordered
Hi,

Okay - so there's at least some consensus for keeping the subobject hash as it is. That means that, for subobjects to be sortable by entry order, I think there would have to be a separate special property, called Has index or Has number or something, that would store the index of each subobject on the page. However, there are a few weaknesses to that approach that I can think of:

- Creating a new special property takes some development work - including, possibly, creating a new database table?
- Subobjects won't be sorted automatically; whoever creates each query will have to remember to add sort=Has index to the query.
- Because of the way sort= works (it removes all items that don't have that property), it will cause some confusion for wikis starting to use the new code: subobjects that don't have Has index set yet will simply not show up in sorted queries.

For these reasons, I'm leaning heavily toward just changing Semantic Internal Objects to use hashes like #001, #002, etc. again, like it used to do. That way, users will have a choice of naming schemes. And it somewhat fits in with the different philosophies of SMW/subobjects vs. SIO: in SMW, such objects can have their own name and identity, while in SIO, they're really only attributes of the page on which they're defined.

-Yaron

On Fri, Jun 21, 2013 at 4:31 AM, Stephan Gambke s7ep...@gmail.com wrote:
> [...]

--
WikiWorks · MediaWiki Consulting · http://wikiworks.com
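The third weakness listed above (a sorted query silently dropping subobjects that have no value for the sort property) can be illustrated with a minimal Python sketch. This only mimics the behavior as Yaron describes it; `sorted_query` and the row layout are hypothetical and are not SMW code.

```python
def sorted_query(rows, sort_property):
    """Mimic the described sort= behavior: rows that lack the sort
    property are removed from the result instead of being sorted last."""
    having = [row for row in rows if sort_property in row]
    return sorted(having, key=lambda row: row[sort_property])


# Hypothetical subobjects; the middle one was saved before "Has index" existed.
rows = [
    {"name": "Step B", "Has index": 2},
    {"name": "Legacy step"},
    {"name": "Step A", "Has index": 1},
]

result = [row["name"] for row in sorted_query(rows, "Has index")]
print(result)  # ['Step A', 'Step B'] - "Legacy step" is missing
```

Under this assumption, every existing page would have to be re-saved before its subobjects reappear in sorted queries, which is the confusion the message anticipates.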
Re: [SMW-devel] Making subobjects correctly ordered
Hi, Yaron!

I think sorting subobjects is a good task, but I suggest not using the subobject name for this, because of one big problem. Imagine we have 3 subobjects on a page:

Page name#001_4bd1f1b74a76de5322dd74956a71f089
Page name#002_03163dfd1d2502668b00c1f521688984
Page name#003_02dwa3j349j8d3jds3843234jd8349490

Now we edit the page and delete subobject 002. What should happen? Should the other subobjects be renamed to keep the sorting? What if they are already linked from other pages or queries?

I think a better way is to automatically attach some semantic property (Sort, for example) to every subobject on the page. This property should contain the subobject's number on the page.

--
View this message in context: http://wikimedia.7.x6.nabble.com/Making-subobjects-correctly-ordered-tp5007553p5007558.html
Sent from the Semantic Mediawiki - Development mailing list archive at Nabble.com.
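Alexey's renaming concern can be made concrete with a short Python sketch. It is a simplified illustration, not SMW or SIO code: the function names, the use of MD5, and the exact name formats are assumptions made for the example.

```python
import hashlib


def names_by_position(page, entries):
    """Index-in-name scheme: '#001', '#002', ... assigned in entry order."""
    return {f"{page}#{i:03d}": entry for i, entry in enumerate(entries, start=1)}


def names_by_content(page, entries):
    """Alternative: name derived from the content, with the position kept
    in a separate (hypothetical) 'Sort' property."""
    named = {}
    for position, entry in enumerate(entries, start=1):
        digest = hashlib.md5(entry.encode("utf-8")).hexdigest()
        named[f"{page}#_{digest}"] = {"value": entry, "Sort": position}
    return named


before = names_by_position("Page name", ["flour", "milk", "eggs"])
after = names_by_position("Page name", ["flour", "eggs"])  # "milk" deleted

link = "Page name#002"  # created elsewhere while it still pointed at "milk"
print(before[link], "->", after[link])  # milk -> eggs

stable_before = names_by_content("Page name", ["flour", "milk", "eggs"])
stable_after = names_by_content("Page name", ["flour", "eggs"])
eggs = next(name for name, data in stable_after.items() if data["value"] == "eggs")
print(eggs in stable_before, stable_before[eggs]["Sort"], stable_after[eggs]["Sort"])
# True 3 2
```

With position-based names, an existing link silently starts pointing at a different subobject after the deletion. With the position held in a property, the name survives and only the Sort value changes.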
Re: [SMW-devel] Making subobjects correctly ordered
Hi Alexey,

Yes, that's a good point - I actually thought about an approach like that, but forgot to include it in the email. A property called Sort (a name like Has index might be a little clearer) would solve this problem - and it would be a more semantic solution. On the other hand, it would add to the proliferation of special properties (for what that's worth), and it would mean a little more work for administrators to get queries of subobjects ordered correctly.

I still think my original proposed solution would work fine, though I confess I don't quite understand how the subobject hashing works. Are people supposed to be able to directly link to or reference a subobject, using the hash? I don't see how that could work, given that everything about a subobject could change from one page save to the next - its order, its properties, etc. I don't see how the system could keep consistency of subobject naming.

-Yaron

On Thu, Jun 20, 2013 at 10:54 AM, Alexey Klimovich god.vedm...@gmail.com wrote:
> [...]
Re: [SMW-devel] Making subobjects correctly ordered
Hi,

Alexey - as Stephan notes, the hash is based on the set of parameters to #subobject, which means that if you change even one value the hash will (I think) change. Which means that linking/referencing a specific subobject based on its hash is not that good of a long-term strategy, I don't think. Though I find it strange to do that kind of thing in the first place, anyway - if something is going to be linked to/discussed directly, as opposed to just aggregated, I would think it should have a true name. But I don't know what the data you're talking about is.

Jamie - your usage sounds interesting, but I think not relevant to this discussion, since you don't need to sort based on entry order.

Stephan - you make some good points. As far as displaying the number/index in queries - that sounds interesting, though even a separate Has index property might not necessarily be ideal for that. If you have two or more different kinds of #subobject calls on a page, it might not work out nicely. For instance, a recipe page might have subobjects for ingredients, and then subobjects for instructions. In that case, the ingredients might have Has index values of 1-10, and then the instructions might have Has index values of 11-15. (That's how numbering used to work in SIO.) So displaying this property might just look weird.

Your second point is that the hash system lets SMW only display unique subobjects. Which is true, but (a) in my experience that's not a major issue, and (b) actually, sometimes you really do want to store duplicate data. What if you have a page of test scores, and you use #subobject to store each student's name and their score, and two students happen to have the same name and the same score? (Let's say that there aren't wiki pages for each student, which would force you to disambiguate.) Duplicate data might not be an error - it might be valid data.

You seem to also make the point that, because it hasn't been entered by a user, the index/number isn't truly a part of the data model, and shouldn't be stored at all. Assuming you are in fact making that point, it's a reasonable opinion, but I disagree, for two reasons: (1) the order of elements is something that users can control, and thus, it is actually implicitly part of the data model, and (2) SMW already stores a bunch of stuff that's even less a part of the data model: the Modification date property, etc. You could say that two wrongs don't make a right (to use an expression), but at the very least, this wouldn't be breaking anything that's not already broken. Again, though, I'm not sure if that's what you were getting at.

-Yaron

On Thu, Jun 20, 2013 at 12:37 PM, Stephan Gambke s7ep...@gmail.com wrote:
> Hi Yaron,
>
> I do not think that your approach will work. At first glance it seems to be an easy way out to provide sorting. But from a software engineering point of view it loads the identifier with information that just does not belong there. From a practical point of view it falls short if anybody wants to query that number. And finally from a semantic point of view it inseparably mixes two statements (the original one and the one about the sequence number) that the originator usually does not want to be mixed.
>
> This last problem btw is also the key to your question about the determination of the hash key. To state the same thing twice is just that: A duplicate statement. As opposed to two statements. To the best of my knowledge SMW will not store such a statement twice. Instead it will generate the hash key based on the property and value and if that hash already exists, then the statement it represents is considered already known and the second occurrence will be dropped and not appear in any query results. I am not sure if this is also true for subjects, but it really should be.
>
> So, long story short: If your data model for project management does not explicitly contain the sequence number for the activities, then your model is incomplete, not SMW. In fact, should two activities be exactly the same, you will probably lose one of them.
>
> Cheers,
> Stephan
>
> On Jun 20, 2013 5:30 PM, Yaron Koren ya...@wikiworks.com wrote:
>> [...]
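The deduplication behavior Stephan describes in the quoted message (a hash derived from properties and values, with a repeated statement dropped) can be sketched in Python as follows. This is only an illustration of the idea under stated assumptions: `statement_hash`, `store`, the MD5 choice, and the property names are hypothetical, and the thread itself is not certain of SMW's exact behavior.

```python
import hashlib


def statement_hash(properties):
    """Hash over the sorted property/value pairs of one statement."""
    canonical = "|".join(f"{key}={properties[key]}" for key in sorted(properties))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()


def store(statements):
    """Keep the first statement seen for each hash; later duplicates are dropped."""
    stored = {}
    for statement in statements:
        stored.setdefault(statement_hash(statement), statement)
    return list(stored.values())


activities = [
    {"Has activity": "Review draft", "Has owner": "Alice"},
    {"Has activity": "Revise draft", "Has owner": "Bob"},
    {"Has activity": "Review draft", "Has owner": "Alice"},  # same statement again
]

kept = store(activities)
print(len(kept))  # 2 - the repeated activity is lost

# With an explicit sequence number in the data model, all three are distinct.
numbered = [
    {**activity, "Has sequence number": number}
    for number, activity in enumerate(activities, start=1)
]
print(len(store(numbered)))  # 3
```

The second half reflects Stephan's conclusion: if the sequence matters, it belongs in the model explicitly, and then identical activities no longer collapse into one.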
Re: [SMW-devel] Making subobjects correctly ordered
Hi Yaron,

I did not propose a general 'has index' property. In fact, I would strongly advise against it. Your recipe example is a good one for a case where an index does not make sense and implying one would be wrong.

For the students example: If your data model identifies students by their name alone, then again the data model is insufficient, not SMW. Basically your statement is 'John has a score of 2'. If you repeat that statement, then a natural person will tell you that you already said that. SMW will drop the second statement. If you actually want both of these statements stored, you had better think of a way to disambiguate.

On the point of the index number being or not being a part of the data model: That has nothing to do at all with whether it comes from a user input or not. You should first build your data model. I do not say that it should not contain index numbers. If you need them, by all means include them. But do so explicitly. Don't just include them in every data model just because they are useful in some cases. Then, when you have your data model, think about how to use it. E.g. how to assign those index numbers. They can for sure come from a user input. They might as well be assigned somehow automatically. I don't care.

The order of the elements may be controllable by the user. But deriving an order of the elements in the model from that is wrong. When the user says he needs eggs, milk and flour then you should not translate that into 'first eggs, second milk and finally flour'. The correct translation would be 'eggs, milk and flour and by the way he ordered eggs first, milk second and flour third'. This means you will end up with six statements - three on the ingredients and three on the statements on the ingredients. Do not mix them.

Regarding Modification date: There is quite a difference. The modification date is a statement on a subject (the wiki page) that is stored with the subject, but without modifying it. Storing the index number of a statement the way you propose, _would_ modify the statement. So, nothing broken with the Modification date, with that index, though...

Cheers,
Stephan

On Jun 20, 2013 7:02 PM, Yaron Koren ya...@wikiworks.com wrote:
> [...]
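Stephan's "six statements" (three about the ingredients, three about those statements) can be modeled with a small Python sketch. The class names `Ingredient` and `EntryOrder` are hypothetical and only serve to show the separation; nothing here corresponds to an actual SMW data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ingredient:
    """A statement about the recipe: it needs this ingredient."""
    name: str


@dataclass(frozen=True)
class EntryOrder:
    """A statement about another statement: the position at which it was entered."""
    about: Ingredient
    position: int


ingredients = [Ingredient("eggs"), Ingredient("milk"), Ingredient("flour")]
order = [EntryOrder(item, position) for position, item in enumerate(ingredients, start=1)]

statements = ingredients + order
print(len(statements))  # 6

# Sorting uses the order statements; the ingredient statements stay untouched.
by_entry = [entry.about.name for entry in sorted(order, key=lambda e: e.position)]
print(by_entry)  # ['eggs', 'milk', 'flour']
```

Because the position lives in its own statement, reordering changes only `EntryOrder` values, and `Ingredient("eggs")` remains equal to itself whatever position it is given.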
Re: [SMW-devel] Making subobjects correctly ordered
Hi Stephan,

For the duplicate data thing: you're right that duplicate data is never a good idea; I was wrong about that. Still, in practical terms I really don't think it matters whether the duplicate data gets stored or not. Duplicate data just doesn't happen very much, if at all; and even when it happens, that's an error in data entry: SMW doesn't have any obligation to fix such errors. (I think all of this is somewhat incidental, actually, because however the index gets stored, there should always be a way to prevent duplicates using the hash.)

As for whether storing the index, whether it's part of the subobject name or in a Has index property, changes the data model - I don't think it does. You can simply ignore the index value; I would just think of it as additional data that can either be used or not. If you didn't like my Modification date example, how about this: the subobject hash itself is some additional, SMW-only data that gets added to each row, but that doesn't affect the data model either.

-Yaron

On Thu, Jun 20, 2013 at 2:04 PM, Stephan Gambke s7ep...@gmail.com wrote:
> [...]