Dear PMario, your code

<$list filter=[tag[dict-AR]indexes] variable=index>
<$set name="subf" value="[getindex<index>split[,]trim[]match<keyword>]">

    <$list filter="[tag[dict-AR]filter<subf>]" variable="entry">
             <<entry>> <br/>
    </$list>
</$set>
</$list>

COULD do the trick, but:

1. it crashes TW with an 'Internal JavaScript Error' red alert. I also 
tried to write something similar myself, and every time I used a variable 
inside a subfilter, like <index> here, TW crashed. That made me think the 
syntax was wrong, but now that you use the same syntax too, I will try to 
troubleshoot the problem;

2. since the 'match' test must be done not against a single keyword but 
against a *set* of keywords, the 'match' test in the subfilter must be 
rewritten accordingly in order to get rid of duplicate lines of output. 
Lacking the flexibility of true variables, the only workaround I could come 
up with is to test ALL the keywords of the set for every property of every 
data tiddler; that is, each property must be tested against EVERY keyword 
in a single filter pass. If you care to read why, there is a thorough 
explanation further below.

Anyway, the two crucial points still to be worked out are: 

a) writing a subfilter 'match' test that tests not against one single 
keyword but against a SET of keywords in a single filter pass, and
b) making TW accept variables inside subfilters without crashing
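
To pin down what point a) asks for, here is a minimal sketch in Python 
(not TW filter syntax; the helper name row_matches is made up for 
illustration). It assumes a row is a comma-separated string and that a 
keyword must equal a whole word of the row, mirroring split[,]trim[]match:

```python
def row_matches(row, keywords):
    """Return True if any word of the comma-separated row is a keyword."""
    # Same idea as split[,] followed by trim[]: cut on commas, strip spaces.
    words = {word.strip() for word in row.split(",")}
    # One test against the whole keyword set instead of one test per keyword.
    return not words.isdisjoint(keywords)


print(row_matches("smell, sense, sniff, stink", {"feel", "smell"}))
print(row_matches("enable, permit", {"feel", "smell"}))
```

The row yields a single yes/no answer however many keywords it contains, 
which is the property the TW subfilter would need.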

I will work on point a), while for point b) there is little I can do 
except maybe update TW to the latest version (the one I currently use is 
5.1.23) or maybe update the JavaScript engine of my browser...

Now for a very verbose description of how my dictionary is structured and 
of the trick to avoid duplicate lines of output:

my dictionary is a cross-language dictionary (say, for example, French to 
English; these are not the true languages, though) where each entry is a 
data tiddler whose properties are groups of words translating the entry's 
meaning, for example:

permettre
  IT-01: allow
  IT-02: enable, permit
  IT-03: let, license

sentir 
  IT-01: feel
  IT-02: smell, sense, sniff, stink

toucher
  IT-01: touch
  IT-02: affect
  IT-03: feel
  IT-04: hit, receive, contact

croire
  IT-01: believe, think
  IT-02: suppose, imagine
  IT-03: feel
  IT-04: consider
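
For anyone who prefers to see the structure as data, the four sample 
entries can be modelled like this in Python. This is only a sketch outside 
TW: the names DICTIONARY and row_words are hypothetical, and the real 
entries are data tiddlers, not Python objects.

```python
# Entry title -> property name -> row (a comma-separated group of words).
DICTIONARY = {
    "permettre": {"IT-01": "allow", "IT-02": "enable, permit",
                  "IT-03": "let, license"},
    "sentir": {"IT-01": "feel", "IT-02": "smell, sense, sniff, stink"},
    "toucher": {"IT-01": "touch", "IT-02": "affect", "IT-03": "feel",
                "IT-04": "hit, receive, contact"},
    "croire": {"IT-01": "believe, think", "IT-02": "suppose, imagine",
               "IT-03": "feel", "IT-04": "consider"},
}


def row_words(row):
    """Split a row on commas and trim each word, like split[,]trim[]."""
    return [word.strip() for word in row.split(",")]


print(row_words(DICTIONARY["sentir"]["IT-02"]))
```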

As you might have noticed, for every French entry there are different 
groups of English words translating it, because (as is common in many 
dictionaries) some translations are more closely related to each other 
(almost interchangeable) than others, and so they are grouped together 
in a single row. I could have gone for a different structure, having, say, 
just one dictionary tiddler whose indexes were French words and whose 
properties were groups of English words translating the indexes, but I 
decided against this solution because it lacks the rich data structure 
that individual tiddlers' fields offer and which I use to store the 
metadata of each entry (grammatical category, infinitive, participles, 
preposition it goes with, etc.). Moreover, having the entries stored as 
individual tiddlers allows for the best possible use of TW's rich 
search-and-filter features, which I use to study and learn the target 
language with a lot of custom code. So the wiki is not only a language 
dictionary, but actually a platform built on top of it to enable the study 
of the language.

One of the customized pieces of code I care about most is this search (call 
it apropos if you like) for the French equivalents of some English keywords 
(to continue with the example of the French-English dictionary), because 
one of the exercises I do is to write a text in English and try to 
translate it on the fly to French, so having some code that quickly spits 
out one or more French equivalents for the English words whose translation 
I'm in doubt about is invaluable. All I have to do is 'tag' each word with 
a hyperlink (a hyperlink that doesn't point to an actual tiddler; it is 
just a convenient way of tagging words in a text so that the code can 
easily collect them through the links[] operator) and my code, fed with the 
exercise's text, takes on the whole burden of searching through the 
dictionary for me: super!
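
Outside TW, the keyword collection step can be imitated with a regular 
expression. This sketch assumes the words are tagged with the standard 
[[word]] or [[text|word]] link syntax; the function name tagged_words is 
made up, and in the wiki itself the work is done by the links[] operator:

```python
import re

# Matches [[target]] and [[text|target]]; the link target is captured.
LINK = re.compile(r"\[\[(?:[^\]|]*\|)?([^\]|]+)\]\]")


def tagged_words(text):
    """Collect link targets in order of first appearance, without repeats."""
    return list(dict.fromkeys(LINK.findall(text)))


exercise = "I [[feel]] cold, I [[smell]] smoke and I [[feel]] tired."
print(tagged_words(exercise))
```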

Say, for example, that I want to search for the French word(s) for 'feel'; 
my code would hopefully output this:

    sentir: feel; smell, sense, sniff, stink
    toucher: touch; affect; feel; hit, receive, contact
    croire: believe, think; suppose, imagine; feel; consider

In my output, since I want to render each entry in a single text row, I use 
a semicolon ';' to join all the different groups of words (rows) together, 
while keeping the comma ',' as a separator for words belonging to the same 
group.
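
As a small illustration of that output format, here is a Python sketch 
(not TW code; format_entry is a hypothetical helper) that joins the rows of 
one entry with ';' and the words of each row with ',':

```python
def format_entry(title, rows):
    """Render one entry as a single line of text."""
    # Normalise each group so that its words are separated by ", ".
    groups = [", ".join(word.strip() for word in row.split(","))
              for row in rows]
    # The groups (rows) themselves are joined with "; ".
    return f"{title}: " + "; ".join(groups)


print(format_entry("sentir", ["feel", "smell, sense, sniff, stink"]))
```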

So far so good, but a clean output also means avoiding duplicates. 
Duplicates can occur when you search for more than one English word at 
the same time. Say that you search for the words 'feel' and 'smell' 
together. Without a means to avoid duplicates, the code returns

    sentir: feel; smell, sense, sniff, stink
    sentir: feel; smell, sense, sniff, stink
    toucher: touch; affect; feel; hit, receive, contact
    croire: believe, think; suppose, imagine; feel; consider

The entry 'sentir' is duplicated because the search had two hits: one for 
the English word 'feel' and one for 'smell'. How do I get rid of such 
duplicates, which clutter my output with spurious rows? This question 
introduces the more general subject of how to incrementally append elements 
to a list and do some post-processing when the list is done, and this could 
be a good subject for a (badly needed, IMHO) 'coding patterns in 
TiddlyWiki' book, specifically aimed at programmers.
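
In a conventional language the 'append, then post-process' pattern looks 
roughly like the Python sketch below. It is not TW code, and the names 
ENTRIES and naive_hits are made up. Note that with the sample entries 
'toucher' is a hit too, since its IT-03 row is 'feel'.

```python
# Three of the sample entries, each with its rows in property order.
ENTRIES = {
    "sentir": ["feel", "smell, sense, sniff, stink"],
    "toucher": ["touch", "affect", "feel", "hit, receive, contact"],
    "croire": ["believe, think", "suppose, imagine", "feel", "consider"],
}


def naive_hits(keywords):
    """Append the entry title once per (row, keyword) hit."""
    hits = []
    for title, rows in ENTRIES.items():
        for row in rows:
            words = [word.strip() for word in row.split(",")]
            for keyword in keywords:
                if keyword in words:
                    hits.append(title)
    return hits


hits = naive_hits(["feel", "smell"])
print(hits)  # 'sentir' is appended twice
# Post-processing step: drop repeats while keeping the original order.
print(list(dict.fromkeys(hits)))
```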

Coming down to my more specific issue, my reasoning was that, by design, 
there are no two entries (data tiddlers) with the same name (title); 
moreover, most likely there aren't two identical 'rows' (two identical 
groups of related words) for the same dictionary entry or, in TW jargon, 
there aren't two properties with the same value in the same data tiddler. 
So duplicates can only occur when you have two or more hits for the same 
'row' of the same entry, because the words in that row match more than one 
keyword, as for 'sentir' above, matched by both keywords 'feel' and 
'smell'. To avoid duplicates, then, all it takes is to test every 'row' of 
every entry against ALL the search keywords in one single filter pass, 
provided that a proper (sub)filter to do the job can be found, because in 
that case duplicates are automatically discarded by the filter processing 
logic itself.
Take for example the entry 'sentir', serialized in one row:

    sentir:  feel; smell, sense, sniff, stink

If we test it against the WHOLE keyword set {'feel', 'smell'} in one 
filter pass, the filter's output will be just one single instance of the 
entry's title, 'sentir', regardless of there having been two hits while 
running the filter.
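
The single-pass idea can be sketched in Python as follows (not TW code; 
ENTRIES and search are hypothetical names). One assumption is worth 
stating loudly: the test here is made once per entry over all of its rows, 
because in the sample data 'feel' and 'smell' sit in two different 
properties of 'sentir' (IT-01 and IT-02).

```python
ENTRIES = {
    "permettre": ["allow", "enable, permit", "let, license"],
    "sentir": ["feel", "smell, sense, sniff, stink"],
    "toucher": ["touch", "affect", "feel", "hit, receive, contact"],
    "croire": ["believe, think", "suppose, imagine", "feel", "consider"],
}


def search(keywords):
    """Return each matching entry title exactly once."""
    wanted = set(keywords)
    found = []
    for title, rows in ENTRIES.items():
        # Serialize the entry: every word of every row in one set.
        words = {word.strip() for row in rows for word in row.split(",")}
        # A single test against the whole keyword set per entry.
        if words & wanted:
            found.append(title)
    return found


print(search(["feel", "smell"]))
```

Since each entry is visited once and yields at most one title, no 
de-duplication step is needed afterwards.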

I hope I have made myself clear enough, and not too boring.

Thanks a ton for your deeply committed help.

CG

On Saturday, November 20, 2021 at 7:32:17 PM UTC+2 PMario wrote:

> I'm not 100% sure but I think the following code may do the trick.
>
> ```
> <$list filter=[tag[dict-AR]indexes] variable=index>
> <$set name="subf" value="[getindex<index>split[,]trim[]match<keyword>]">
>
>     <$list filter="[tag[dict-AR]filter<subf>]" variable="entry">
>              <<entry>> <br/>
>     </$list>
> </$set>
> </$list>
> ```
>
> I didn't test the code, since I don't have a wiki with test data. 
> Could you provide a minimal test-case wiki that contains some data 
> tiddlers with your structure and shows how the output should look? 
>
> I can put something together myself with the data from the 2nd post, but I 
> can't be sure that it really looks like your test data. 
>
> There is 1 more question. Why did you use data-tiddlers? Is your data 
> collected with TW, or does it come from a 3rd party app?
>
> -mario
>
>
