Re: passing more parameters leads to duplicate content

2010-05-10 Thread WebbedIT
> I find your insistence that in no normal circumstances could massl have
> a point (which is how I understand your replies to this thread) - a
> bit weird.

Firstly, I certainly did not intend to say massl could not have a
point under any circumstance. So apologies if I came across as
dismissive in any way.

> have you never clicked on a link in a forum/email/whatever
> and got part of the message appended to the url? here, try this one
> E.g.
>
> The book's full of useful info, it's all here
> http://book.cakephp.org/view/875/x1-3-Collection/and guess what my
> space bar's a bit dicky.

I joined the discussion as I believed that Google would only pick up
on and score down a duplicate content link if one of its robots found
such a physical link.  Up until looking at the above example I could
not think of a situation where someone would accidentally add a
trailing slash and extra words to the end of a link.  But I can see
how that is entirely possible in the context of forums etc., and I had
forgotten that browsers/people can add trailing slashes to links (I'm
not in the habit of doing so myself, but should stop thinking people
all use the web the way I do).

> I find your opinion in this thread and massive overreaction to my
> example (seek out? I typed your nick in google and appended example
> texts to the first hit) contradictory/hypocritical. The main reason
> for using your own site was:
>
> 1) to demonstrate it's easily possible (out of curiosity why isn't
> your site using the stock pages controller?)
> 2) to demonstrate that you and google aren't the people with the
> 'control' to make the problem arise - which to me is the most
> important reason to consider defending against it.

To reply to a thread raising the issue of duplicate content and then
to see the next reply displaying links to my site, which created this
exact problem, made your reply seem very personal.  I firmly believe
others would have been surprised to see their websites used as an
example too.

Why didn't you reply to John Andersen or MilesJ in the same manner? I
know the answer to this is probably that I was the last person to post
when you replied, but John had taken a very similar line to me: that if
the link did not exist in your site then Google would not index it, and
Miles had questioned the depth of the problem too!

My initial reaction was a polite request for you to remove those
links, your response was a mocking no. By this point I was more
focused on the personal aspect of your replies than whatever point you
were attempting to make by posting the links in the first place.

1) Your demonstration, at the time, only proved that if you were
deliberately trying to type a link wrong you could, and that Cake would
accept it. But I had not said no-one could type erroneous links, just
that people are likely to copy and paste them.  However, I can now see
how typos could occur, that someone could mount a malicious attack
against a site, and that you should cover such eventualities.

1a) I modified the pages controller to work in a CMS type manner, I am
guessing I have removed something that would help in this situation?

2) Yip, typos happen and malicious attacks are malicious and
untraceable to mere mortals.  My moral compass was overriding reality
when I mentioned reporting people.

> I'm sorry you feel you've been wronged it was just an example to
> clearly demonstrate the opposite of your message. Given you emailed me
> offlist (contrary to popular belief I don't sleep plugged into the
> internet) I see you really do feel strongly about it -

Yip, I did feel wronged, as you picked on me alone when others had made
the same/similar points, and creating examples of the problem being
discussed using my website can only be taken as directed towards me
individually, which has to be the definition of personal.

> which further
> confuses me as to why you insist massl's thread is to address a
> problem that doesn't exist (despite him requesting that whether it
> exists not be discussed)

I was wrong to say "I do no (sic) see how this is a problem", but I
was not the only one to say that.

massl's request for the discussion not to go in this direction was
missed, being at the bottom of his second post.  I picked up on this
later and have apologised, but our online differences aside (a side
effect of non verbal communication), I do feel as though it's been a
useful discussion to have.  I am here to learn and I'm therefore happy
to be corrected by anyone.

> However, to bring things back to the original point - IMO defending
> against malicious users isn't the main reason you'd consider the (same
> content - different url) problem.
>
> Here's some example urls that an app can easily generate, one way or
> another, and they'll all contain the same content:
>
> 1) You define some vanity/i18n routes, consider an action in a plugin
> controller:
> http://example.com/action/
> http://example.com/plugin/action/ *
> http://example.com/plugin/plugin/action/
> http://ex

Re: passing more parameters leads to duplicate content

2010-05-09 Thread AD7six


On May 7, 3:46 pm, WebbedIT  wrote:
> > I guess you've never heard of black hat seo techniques.
>
> Yip, certainly have
>
> > Report them for what - most of the time we're talking about typos
>
> A typo wouldn't lead to this issue, a typo would lead to your domain
> or main parameters being wrong which would result in CakePHP kicking
> out some sort of error.  I can't see how anyone would accidentally
> type an extra forward slash and then add extra params

I find your insistence that in no normal circumstances could massl have
a point (which is how I understand your replies to this thread) - a
bit weird. have you never clicked on a link in a forum/email/whatever
and got part of the message appended to the url? here, try this one
E.g.

The book's full of useful info, it's all here
http://book.cakephp.org/view/875/x1-3-Collection/and guess what my
space bar's a bit dicky.

>
> > you should /consider/ malicious users (and good luck with reporting
> > "the site" - the links will be comment spam - all across the web -
> > pointing at you, you can't report "the internet").
>
> I bow down to this point, if someone mounts a campaign against a site
> then they are likely to use bots and comment spam so the links would
> not be from the site of the person conducting the campaign.  However,
> you could block the offending sites accepting comment spam and contact
> the owner of the site, but this would quickly become tedious as it's
> not going to stop the malicious idiot sending the comment spam.
>
> However I still think the chances of someone resorting to such action
> against your average site would be few and far between, but agree that
> this is something that people should be aware of.  I wouldn't want
> newbies or anti-cakephp peeps to see this thread and cause widespread
> hysteria that CakePHP is crap as any site running on it is going to
> be under attack from such premeditated malicious attacks.
>
> > > Now can you remove them links from the net please!!!
>
> > Sure, they're scheduled to be removed in 2020
>
> No seriously, remove them ... you did not need to search out and use
> my real site to prove your point

I find your opinion in this thread and massive overreaction to my
example (seek out? I typed your nick in google and appended example
texts to the first hit) contradictory/hypocritical. The main reason
for using your own site was:

1) to demonstrate it's easily possible (out of curiosity why isn't
your site using the stock pages controller?)
2) to demonstrate that you and google aren't the people with the
'control' to make the problem arise - which to me is the most
important reason to consider defending against it.

> and whilst I respect your wealth of
> Cake development knowledge and what you give to the community I think
> that was a pretty crappy thing to do just to prove a point.  

I'm sorry you feel you've been wronged it was just an example to
clearly demonstrate the opposite of your message. Given you emailed me
offlist (contrary to popular belief I don't sleep plugged into the
internet) I see you really do feel strongly about it - which further
confuses me as to why you insist massl's thread is to address a
problem that doesn't exist (despite him requesting that whether it
exists not be discussed)

However, to bring things back to the original point - IMO defending
against malicious users isn't the main reason you'd consider the (same
content - different url) problem.

Here's some example urls that an app can easily generate, one way or
another, and they'll all contain the same content:

1) You define some vanity/i18n routes, consider an action in a plugin
controller:
http://example.com/action/
http://example.com/plugin/action/ *
http://example.com/plugin/plugin/action/
http://example.com/plugin/plugin/action/something <- massl's concern

2) You use pagination
http://example.com/controller/
http://example.com/controller/index/
http://example.com/controller/index/sort:id/
http://example.com/controller/index/sort:created/
http://example.com/controller/index/page:1/
http://example.com/controller/index/page:1/sort:asc/
http://example.com/controller/index/sort:asc/page:1/
etc.

If you insist that it's impossible for someone to maliciously or
accidentally append something to a url which your code will ignore -
you should at least consider how your own code is generating links.
It's possible with some forethought to forgo the entire problem - by
using a canonical metatag as lucca suggested and/or by using a
component to apply some intelligent 301 redirect logic for you.
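
To illustrate the pagination list above, here is a rough,
framework-neutral sketch in Python (not CakePHP code) of collapsing
those variants to one URL. The function name `normalize`, and the
assumptions that `index` is the default action and `page:1` is the
default page, are illustrative only; a real app would take them from
its own routes and pagination settings.

```python
DEFAULT_ACTION = "index"       # assumption: the framework's default action
DEFAULT_NAMED = {"page": "1"}  # assumption: no page arg means page 1


def normalize(path):
    """Collapse equivalent paths into one canonical form."""
    parts = [p for p in path.split("/") if p]
    # Plain segments (controller, action, args) vs. named args ("key:value").
    plain = [p for p in parts if ":" not in p]
    named = dict(p.split(":", 1) for p in parts if ":" in p)
    # Drop named args that only restate a default.
    named = {k: v for k, v in named.items() if DEFAULT_NAMED.get(k) != v}
    # Drop an explicit default action when nothing follows it.
    if not named and len(plain) == 2 and plain[1] == DEFAULT_ACTION:
        plain = plain[:1]
    # Emit named args in alphabetical order so their order cannot vary.
    tail = [f"{k}:{v}" for k, v in sorted(named.items())]
    return "/" + "/".join(plain + tail)
```

Any request whose path differs from `normalize(path)` could then be
301-redirected to the normalized form, or the normalized form could be
used as the canonical URL in the page head.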

anyway, hth,

AD
* with latest 1.3 it doesn't automatically do/understand this any more

Check out the new CakePHP Questions site http://cakeqs.org and help others with 
their CakePHP related questions.

You received this message because you are subscribed to the Google Groups 
"CakePHP" group.
To post to this group, send email to cake-php@googlegroups.com
To unsubscribe from this group, send email to
cake-php+unsubscr...@googl

Re: passing more parameters leads to duplicate content

2010-05-07 Thread WebbedIT
> I also don't want to discuss whether it's an issue or not

@massl: Only just caught the above line in your 2nd post, so sorry to
take the topic in that direction, but I think it is a useful thread
for others to read as any site could fall foul of such malicious
attacks, although in my humble opinion it's likely to only be larger
more successful sites that people would bother to target.

Paul.

Check out the new CakePHP Questions site http://cakeqs.org and help others with 
their CakePHP related questions.

You received this message because you are subscribed to the Google Groups 
"CakePHP" group.
To post to this group, send email to cake-php@googlegroups.com
To unsubscribe from this group, send email to
cake-php+unsubscr...@googlegroups.com For more options, visit this group at 
http://groups.google.com/group/cake-php?hl=en


Re: passing more parameters leads to duplicate content

2010-05-07 Thread WebbedIT
> I guess you've never heard of black hat seo techniques.

Yip, certainly have

> Report them for what - most of the time we're talking about typos

A typo wouldn't lead to this issue, a typo would lead to your domain
or main parameters being wrong which would result in CakePHP kicking
out some sort of error.  I can't see how anyone would accidentally
type an extra forward slash and then add extra params

> you should /consider/ malicious users (and good luck with reporting
> "the site" - the links will be comment spam - all across the web -
> pointing at you, you can't report "the internet").

I bow down to this point, if someone mounts a campaign against a site
then they are likely to use bots and comment spam so the links would
not be from the site of the person conducting the campaign.  However,
you could block the offending sites accepting comment spam and contact
the owner of the site, but this would quickly become tedious as it's
not going to stop the malicious idiot sending the comment spam.

However I still think the chances of someone resorting to such action
against your average site would be few and far between, but agree that
this is something that people should be aware of.  I wouldn't want
newbies or anti-cakephp peeps to see this thread and cause widespread
hysteria that CakePHP is crap as any site running on it is going to
be under attack from such premeditated malicious attacks.

> > Now can you remove them links from the net please!!!
>
> Sure, they're scheduled to be removed in 2020

No seriously, remove them ... you did not need to search out and use
my real site to prove your point and whilst I respect your wealth of
Cake development knowledge and what you give to the community I think
that was a pretty crappy thing to do just to prove a point.  The least
you can do is remove that post as it will not disrupt the flow of this
thread.

Paul



Re: passing more parameters leads to duplicate content

2010-05-07 Thread AD7six


On May 7, 10:32 am, WebbedIT  wrote:
> @AD7six: I'm not sure why that was necessary as I implied in my reply
> that the only way Google would find incorrect links to index was if
> someone maliciously posted erroneous links, but that would have to be
> a very rare situation to be

I guess you've never heard of black hat seo techniques.

> in and you could easily find out which
> site they had come from and report them to Google!

Report them for what - most of the time we're talking about typos -
you should /consider/ malicious users (and good luck with reporting
"the site" - the links will be comment spam - all across the web -
pointing at you, you can't report "the internet").

Anyway here look what I found in the book:

http://book.cakephp.org/view/4/Hot-chicks

It doesn't have to be to guard against malicious users though - the
reason the book does those redirects is to auto correct typos in
titles and keep the seo sugar.

(and: it's just an example I know you can 'beat' the redirect logic in
the book and that it stopped working for i18n content)

>
> Now can you remove them links from the net please!!!

Sure, they're scheduled to be removed in 2020. In the meantime I'd
suggest reconsidering the topic of this thread with your new found
insight  ;)

hth,

AD



Re: passing more parameters leads to duplicate content

2010-05-07 Thread WebbedIT
@AD7six: I'm not sure why that was necessary as I implied in my reply
that the only way Google would find incorrect links to index was if
someone maliciously posted erroneous links, but that would have to be
a very rare situation to be in and you could easily find out which
site they had come from and report them to Google!

Now can you remove them links from the net please!!!



Re: passing more parameters leads to duplicate content

2010-05-06 Thread AD7six


On May 5, 2:19 pm, massl  wrote:
> On 5 Mai, 14:01, John Andersen  wrote:
>
> > I just wonder, when the search engine goes through your site, then
> > your site does not make the duplicate URLs (I assume), so the issue
> > should not arise!
>
> > If I am wrong, please clarify :)
>
> Yes sure, it's not a hugely serious problem. But it could be that you
> develop the website and change the parameter count. Or someone mistakenly
> posts a link somewhere with more parameters than needed...and so on.
> I also don't want to discuss whether it's an issue or not. It
> would just be great if someone knows a solution for this (no code
> needed, just theoretical).

here's a bit of both:

http://github.com/AD7six/mi/blob/master/controllers/components/seo.php#L109

The intention was for any link like:
/controller/view/1/wrong-slug -> 301 redirect -> /controller/view/1/right-slug
or e.g.
/controller/view/1/right-slug -> 301 redirect -> /prettyroute-1-rightslug
(matching routes definition)

or e.g.
/controller/index/1/2/3/4 -> 301 redirect -> /controller/index/1
(action only has 1 parameter)

or e.g.
/controller/index/some:namedarg/page:2 -> 301 redirect -> /controller/index/page:2/some:namedarg
(order named args alphabetically so there's only one 'real' url)

And all combinations thereof to result in only 1 url per real page.

It's unlikely to work perfectly - it's a while since I looked at it
and I don't use it atm. If it doesn't work and you can't figure it out
use it only for reference and write something that solves your use
case. I'd also recommend/consider Lucca's suggestion since it bypasses
the whole problem/process.
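
As a rough illustration of two of the rules listed above (trim the
passed args the action cannot use, and sort named args), here is a
framework-neutral sketch in Python. It is not the code of the linked
seo component; every name in it (`action_arity`, `canonical_path`,
`redirect_target`, the example `index` action) is hypothetical. The
slug-correcting and pretty-route cases need model and route lookups
and are not covered.

```python
import inspect


def action_arity(action):
    """Count the positional parameters an action function accepts."""
    kinds = (inspect.Parameter.POSITIONAL_ONLY,
             inspect.Parameter.POSITIONAL_OR_KEYWORD)
    params = inspect.signature(action).parameters.values()
    return sum(1 for p in params if p.kind in kinds)


def canonical_path(controller, action_name, action, segments):
    """Build the single 'real' path for a request.

    segments: the URL parts after /controller/action/, for example
    ["1", "2", "page:2", "some:namedarg"].
    """
    # Split named args ("key:value") from plain passed args.
    named = sorted(s for s in segments if ":" in s)
    passed = [s for s in segments if ":" not in s]
    # Drop passed args beyond what the action can accept.
    passed = passed[:action_arity(action)]
    return "/" + "/".join([controller, action_name] + passed + named)


def redirect_target(requested, canonical):
    """Return the 301 target, or None if the URL is already canonical."""
    return None if requested == canonical else canonical


def index(category):
    """Hypothetical action that takes exactly one parameter."""
    return category
```

Reading the parameter count from the action itself is one way to avoid
hand-maintaining a count in every action; whether that is practical
depends on the framework's dispatcher.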

hth,

AD



Re: passing more parameters leads to duplicate content

2010-05-06 Thread AD7six


On May 6, 8:47 am, WebbedIT  wrote:
> I do no see how this is a problem as neither you or a search engine
> would add extra unneeded parameters to a link, and if the hard coded
> links do not exist in your pages then Google cannot index them
>
> Also anyone linking to your pages is just going to copy/paste an URL,
> they're not going to add in extra unneeded parameters when doing so.

Here you go:

http://www.webbedit.co.uk/pages/about/chickensoup
http://www.webbedit.co.uk/pages/about/malicioususers
http://www.webbedit.co.uk/pages/about/howgoogleworks
http://www.webbedit.co.uk/pages/about/imaginebbotgeneratingthousandsofthesetopoisonyourgooglerank

AD



Re: passing more parameters leads to duplicate content

2010-05-05 Thread WebbedIT
I do no see how this is a problem as neither you or a search engine
would add extra unneeded parameters to a link, and if the hard coded
links do not exist in your pages then Google cannot index them

Also anyone linking to your pages is just going to copy/paste an URL,
they're not going to add in extra unneeded parameters when doing so.

Paul



Re: passing more parameters leads to duplicate content

2010-05-05 Thread Miles J
But how are they duplicates?

/users/profile/1/
/users/profile/2/
/blog/read/some-slug/123/
/blog/read/slug/12356/
/image/view/15

None of those are duplicates.

Why would you pass arguments that ultimately don't decide how the
action renders?

On May 5, 8:17 am, Lucca Mordente  wrote:
> What about inserting a canonical link tag in pages that are prone to
> having duplicate urls?
>
> The canonical tag (<link rel="canonical" href="...">) tells the search
> engine that the right url for a page is the url you put as canonical.
> This way, even if the same page can be reached through several urls, you
> will not have indexing problems.
>
> Cheers!
>
> On May 5, 9:19 am, massl  wrote:
>
>
>
> > On 5 Mai, 14:01, John Andersen  wrote:
>
> > > I just wonder, when the search engine goes through your site, then
> > > your site does not make the duplicate URLs (I assume), so the issue
> > > should not arise!
>
> > > If I am wrong, please clarify :)
>
> > Yes sure, it's not a hugely serious problem. But it could be that you
> > develop the website and change the parameter count. Or someone mistakenly
> > posts a link somewhere with more parameters than needed...and so on.
> > I also don't want to discuss whether it's an issue or not. It
> > would just be great if someone knows a solution for this (no code
> > needed, just theoretical).
>
> > massl
>


Re: passing more parameters leads to duplicate content

2010-05-05 Thread Lucca Mordente
What about inserting a canonical link tag in pages that are prone to
having duplicate urls?

The canonical tag (<link rel="canonical" href="...">) tells the search
engine that the right url for a page is the url you put as canonical.
This way, even if the same page can be reached through several urls, you
will not have indexing problems.
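
A minimal sketch of emitting that tag, in Python rather than CakePHP
code. The function name `canonical_link_tag` is made up for
illustration; the `rel="canonical"` link element itself is the
mechanism search engines document for this purpose.

```python
from html import escape


def canonical_link_tag(canonical_url):
    """Render the <link rel="canonical"> element for a page's <head>."""
    # Escape the URL so characters such as & or " cannot break the attribute.
    return '<link rel="canonical" href="%s">' % escape(canonical_url, quote=True)
```

Every variant of a page (extra params, reordered named args) would
print the same tag, pointing at the one URL you want indexed.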

Cheers!

On May 5, 9:19 am, massl  wrote:
> On 5 Mai, 14:01, John Andersen  wrote:
>
> > I just wonder, when the search engine goes through your site, then
> > your site does not make the duplicate URLs (I assume), so the issue
> > should not arise!
>
> > If I am wrong, please clarify :)
>
> Yes sure, it's not a hugely serious problem. But it could be that you
> develop the website and change the parameter count. Or someone mistakenly
> posts a link somewhere with more parameters than needed...and so on.
> I also don't want to discuss whether it's an issue or not. It
> would just be great if someone knows a solution for this (no code
> needed, just theoretical).
>
> massl
>


Re: passing more parameters leads to duplicate content

2010-05-05 Thread massl
On 5 Mai, 14:01, John Andersen  wrote:
> I just wonder, when the search engine goes through your site, then
> your site does not make the duplicate URLs (I assume), so the issue
> should not arise!
>
> If I am wrong, please clarify :)

Yes sure, it's not a hugely serious problem. But it could be that you
develop the website and change the parameter count. Or someone mistakenly
posts a link somewhere with more parameters than needed...and so on.
I also don't want to discuss whether it's an issue or not. It
would just be great if someone knows a solution for this (no code
needed, just theoretical).

massl



Re: passing more parameters leads to duplicate content

2010-05-05 Thread John Andersen
I just wonder, when the search engine goes through your site, then
your site does not make the duplicate URLs (I assume), so the issue
should not arise!

If I am wrong, please clarify :)
Enjoy,
   John

On May 5, 2:33 pm, massl  wrote:
> Hi,
>
> I currently have a SEO problem with CakePHP.
>
> For example, you have a "users" controller with an action "register"
> that is called by example.com/users/register. You can now add more
> arguments to the URL, e.g.
> example.com/users/register/my/duplicate/content.
>
> That's very bad for SEO because you can have unlimited duplicate
> content. One solution (that isn't very good) is to check the
> parameter count in every function with func_num_args() and then 301
> redirect to the correct URL. But that is at least two more lines in
> every function, and you have to enter the parameter count manually.
>
> Does someone maybe have a better idea to solve this issue?
>
> massl
>


passing more parameters leads to duplicate content

2010-05-05 Thread massl
Hi,

I currently have a SEO problem with CakePHP.

For example, you have a "users" controller with an action "register"
that is called by example.com/users/register. You can now add more
arguments to the URL, e.g.
example.com/users/register/my/duplicate/content.

That's very bad for SEO because you can have unlimited duplicate
content. One solution (that isn't very good) is to check the
parameter count in every function with func_num_args() and then 301
redirect to the correct URL. But that is at least two more lines in
every function, and you have to enter the parameter count manually.

Does someone maybe have a better idea to solve this issue?

massl
