Re: [TYPO3-english] mnoGoSearch indexing

2014-03-07 Thread Pero Peric

On 7.3.2014. 8:21, Dmitry Dulepov wrote:

Hi!

Pero Peric wrote:

Ah TYPO3, TYPO3.. 5 million features, Fluid, spaceship enterprises,
and when it comes to elementary modules like search.. indexed search is not
good for many pages, mnoGoSearch indexes something I didn't tell it to
index, Solr is probably good but of course Java + Tomcat bla bla.. What am I
left with.. maybe this ke_search, but based on experience something is
probably wrong there too :-) And we are talking about a module that is
on almost every site. Ah..

Maybe the people who are putting so much effort into things like Fluid
and spaceship enterprises could come down to earth a bit and make some
built-in, fast, working search to replace indexed search, but it seems
that's not so fancy.. ah..


For mnoGoSearch I had to use mnoGoSearch's own indexer, which is not
highly customizable. It is a crawler and indexer all-in-one. The problem
you described comes from the way mnoGoSearch works.

For now I would suggest Solr as a much better alternative. I hate
products that use Java, but I was able to set up the whole thing in about
two hours without any prior experience. It is really not that difficult
if you have some *nix, XML and console skills.


Dmitry, believe me, in my environment, and by environment I mean 39287
different projects, different jobs and poor organization, you want things
as simple as they can be, so installing Java/Tomcat on top of it all
would just be too much. But OK, thanks for the advice. I will leave
mnoGoSearch as it is, or I will try ke_search if I find time. Maybe for
some other, bigger sites I'll try Solr some time (when something else
comes up :-)


Regards.


___
TYPO3-english mailing list
TYPO3-english@lists.typo3.org
http://lists.typo3.org/cgi-bin/mailman/listinfo/typo3-english


Re: [TYPO3-english] mnoGoSearch indexing

2014-03-06 Thread Pero Peric

On 5.3.2014. 14:26, Philipp Gampe wrote:

Hi Pero,

Pero Peric wrote:


Maybe the people who are putting so much effort into things like Fluid
and spaceship enterprises could come down to earth a bit and make some
built-in, fast, working search to replace indexed search, but it seems
that's not so fancy.. ah.


It is not that easy to write a good search engine. On most smaller websites,
the search is broken or very slow.
On top of that, TYPO3 CMS has a non-trivial content model, which makes
searching a very complex topic. Our flexibility bites us in the ass here. We
know this hurts, but we cannot really do anything about it.

Why? Well, there is no predefined way in which the content is rendered on the
website. You can use TYPO3 CMS for a fully AJAX website that spits out JSON
content, or you can create a one-page site out of many pages. You can create
a traditional site, or you can use it for completely non-web publishing
processes. The issue here is that you cannot know (or calculate) how a
certain record will be rendered on a website. It might just be placed as a
content object (on a traditional website), or it might not be rendered
at all. It might show up on a different page or even on all pages, because
it is directly referenced via TypoScript (RECORD).
It might not show up at all, because some part of the website is hidden
because no link to it is rendered.

Essentially, you cannot know programmatically how a record will show up on
which URL. And that is what you need to know to write a search.
So how does indexed search solve this? It solves it by introducing markers
that wrap the text that should show up in search, and it analyzes only fully
rendered, cached (!) pages. Why only cached pages? Because the website can be
completely different for logged-in users, or users with a certain browser, or
visitors from a certain country, or a few dozen other conditions. Cached
pages have a finite number of conditions, which can be taken into account
for searches as well.
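
To illustrate the marker idea described above, here is a minimal Python sketch. It assumes the usual TYPO3 marker comments `<!--TYPO3SEARCH_begin-->` and `<!--TYPO3SEARCH_end-->`; the function names are made up for this example, and the rule "a page without markers is indexed as a whole" is a simplifying assumption. It is not the actual indexed_search code, which handles more cases.

```python
import re

# Marker comments that delimit the searchable part of a rendered page.
BEGIN = "<!--TYPO3SEARCH_begin-->"
END = "<!--TYPO3SEARCH_end-->"


def extract_searchable(html: str) -> str:
    """Return the markup found between begin/end marker pairs.

    Simplifying assumption: a page without a begin marker is returned whole.
    """
    if BEGIN not in html:
        return html
    pattern = re.escape(BEGIN) + r"(.*?)" + re.escape(END)
    parts = re.findall(pattern, html, flags=re.DOTALL)
    return "\n".join(part.strip() for part in parts)


def strip_tags(fragment: str) -> str:
    """Very rough tag removal and whitespace cleanup, for illustration only."""
    return " ".join(re.sub(r"<[^>]+>", " ", fragment).split())


page = (
    '<nav><a href="/a">Menu</a></nav>'
    + BEGIN + "<p>Hello world</p>" + END
    + "<footer>Foot</footer>"
)
searchable = extract_searchable(page)
text = strip_tags(searchable)
```

Here the menu and footer sit outside the markers, so only the paragraph text would end up in the index.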

How does Solr solve this? It lets you create rules for every kind of
record. This results in a very long list of rules and still needs custom
code for complex cases.
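
As a rough picture of what "a rule for every kind of record" means, the Python sketch below maps a table name to a function that turns a record into a search document. Everything in it is hypothetical: the rule functions, the URL patterns and the choice of fields are assumptions for illustration, and this is not how the Solr integration is actually configured.

```python
from typing import Callable, Dict, Optional

Record = Dict[str, str]
Document = Dict[str, str]


def content_rule(record: Record) -> Document:
    # Hypothetical rule: a content element is found on the page it lives on.
    return {
        "title": record["header"],
        "content": record["bodytext"],
        "url": "/page/" + record["pid"],
    }


def news_rule(record: Record) -> Document:
    # Hypothetical rule: a news record has its own detail URL.
    return {
        "title": record["title"],
        "content": record["short"] + " " + record["bodytext"],
        "url": "/news/" + record["uid"],
    }


# One rule per record type; a real site would need many more of these.
RULES: Dict[str, Callable[[Record], Document]] = {
    "tt_content": content_rule,
    "tt_news": news_rule,
}


def to_document(table: str, record: Record) -> Optional[Document]:
    """Apply the rule for this table, or skip records without a rule."""
    rule = RULES.get(table)
    return rule(record) if rule else None


doc = to_document(
    "tt_news", {"uid": "7", "title": "T", "short": "S", "bodytext": "B"}
)
```

The point of the sketch is only that each record type needs its own answer to "what is the text, and under which URL does it appear", which is why the list of rules grows with the site.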

Therefore you can either index mostly static pages, which is almost trivial
(indexed_search), or you can use a big solution that needs a complex (and
expensive) setup.
Of course there are solutions in between, like ke_search, but they will not
cover every situation.

After all, it boils down to which record will show up where. Therefore a
search is as custom to a site as the template used to render those records.
Nobody has bothered enough yet to write a search engine that is as flexible
as the templating approach, and I am very sure that if someone did, a lot of
people would complain that it is sooo complicated to set up.

The reason why there is no super-duper search engine is that nobody has
a high-quality solution, and the core team will not accept another
half-baked, half-working pseudo solution.
The difference from other CMSs is that the content goes through an unknown
number of transformations before it is rendered on the website. Therefore it
is not enough to know what content is on what page to create a working search.


Philipp, thanks for the explanation. The web sites I'm working on mostly use
the text/img content types and tt_news, so I don't in fact need something
ultra complex. I was looking for a replacement for indexed search because it
was not working well for a web site with 1000+ pages, and so I found
mnoGoSearch. It works fine, but I don't like that it indexes links outside
the search markers. Maybe I will try ke_search, we'll see.


Regards.




Re: [TYPO3-english] mnoGoSearch indexing

2014-03-06 Thread Dmitry Dulepov

Hi!

Pero Peric wrote:

Ah TYPO3, TYPO3.. 5 million features, Fluid, spaceship enterprises,
and when it comes to elementary modules like search.. indexed search is not
good for many pages, mnoGoSearch indexes something I didn't tell it to
index, Solr is probably good but of course Java + Tomcat bla bla.. What am I
left with.. maybe this ke_search, but based on experience something is
probably wrong there too :-) And we are talking about a module that is
on almost every site. Ah..

Maybe the people who are putting so much effort into things like Fluid
and spaceship enterprises could come down to earth a bit and make some
built-in, fast, working search to replace indexed search, but it seems
that's not so fancy.. ah..


For mnoGoSearch I had to use mnoGoSearch's own indexer, which is not
highly customizable. It is a crawler and indexer all-in-one. The problem
you described comes from the way mnoGoSearch works.


For now I would suggest Solr as a much better alternative. I hate
products that use Java, but I was able to set up the whole thing in about
two hours without any prior experience. It is really not that difficult if
you have some *nix, XML and console skills.


--
Dmitry Dulepov

Today is a good day to have a good day.


Re: [TYPO3-english] mnoGoSearch indexing

2014-03-05 Thread Dmitry Dulepov

Hi!

Pero Peric wrote:

Hm. I understand it crawls pages by following links, but why does it index
the links? I see these as two separate processes.


mnoGoSearch doesn't :)

--
Dmitry Dulepov

Today is a good day to have a good day.


Re: [TYPO3-english] mnoGoSearch indexing

2014-03-05 Thread Pero Peric

On 5.3.2014. 13:07, Dmitry Dulepov wrote:

Hi!

Pero Peric wrote:

Hm. I understand it crawls pages by following links, but why does it index
the links? I see these as two separate processes.


mnoGoSearch doesn't :)


Ah TYPO3, TYPO3.. 5 million features, Fluid, spaceship enterprises,
and when it comes to elementary modules like search.. indexed search is not
good for many pages, mnoGoSearch indexes something I didn't tell it to
index, Solr is probably good but of course Java + Tomcat bla bla.. What am I
left with.. maybe this ke_search, but based on experience something is
probably wrong there too :-) And we are talking about a module that is
on almost every site. Ah..


Maybe the people who are putting so much effort into things like Fluid
and spaceship enterprises could come down to earth a bit and make some
built-in, fast, working search to replace indexed search, but it seems
that's not so fancy.. ah..


regards.



Re: [TYPO3-english] mnoGoSearch indexing

2014-03-03 Thread Dmitry Dulepov

Hi!

Pero Peric wrote:

So, does this mean that links are kept even though they are outside the
search markers?


mnoGoSearch works by crawling links. So if you only keep the text inside the
search markers, only pages linked from the main content will be indexed.
This means that if you exclude the menu, nothing will be found.


So links outside of the content are indexed.
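
The effect can be shown with a small Python sketch of a link-following crawler over an in-memory site. This is a toy model under stated assumptions, not mnoGoSearch's code: the `SITE` dictionary, the helper names and the `links_from_marked_only` switch are all invented for the example.

```python
from collections import deque
from html.parser import HTMLParser

BEGIN = "<!--TYPO3SEARCH_begin-->"
END = "<!--TYPO3SEARCH_end-->"


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_in(html):
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links


def marked_content(html):
    """Return the part between the first begin/end markers, else the page."""
    start, end = html.find(BEGIN), html.find(END)
    if start == -1 or end == -1 or end < start:
        return html
    return html[start + len(BEGIN):end]


def crawl(site, start_url, links_from_marked_only):
    """Breadth-first crawl; returns the set of URLs that were reached."""
    seen, queue = {start_url}, deque([start_url])
    while queue:
        html = site[queue.popleft()]
        source = marked_content(html) if links_from_marked_only else html
        for link in links_in(source):
            if link in site and link not in seen:
                seen.add(link)
                queue.append(link)
    return seen


# Hypothetical three-page site: the menu sits outside the search markers.
MENU = '<nav><a href="/">Home</a> <a href="/news">News</a> <a href="/contact">Contact</a></nav>'
SITE = {
    "/": MENU + BEGIN + '<p>Welcome. Read <a href="/news">the news</a>.</p>' + END,
    "/news": MENU + BEGIN + "<p>Latest news.</p>" + END,
    "/contact": MENU + BEGIN + "<p>Contact us.</p>" + END,
}

reached_with_menu = crawl(SITE, "/", links_from_marked_only=False)
reached_without_menu = crawl(SITE, "/", links_from_marked_only=True)
```

In this toy site, following links from the whole page reaches all three pages, while following only links inside the markers never reaches `/contact`, because it is linked only from the menu.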

--
Dmitry Dulepov

Today is a good day to have a good day.