#1
05-31-2005, 01:21 AM
Ignoramus29781
Google stopped indexing my wikipedia mirror

I recently integrated Wikipedia with my site, using two
approaches. One is linking individual wiki pages into my algebra
modules. The links in those pages point to the real Wikipedia, but
JavaScript in them redirects a reader who clicks on them to my
site. This lets users who click on these links stay within one
algebra module. I am not concerned about that case.

The second is that I have a full, crosslinked Wikipedia mirror under
one particular directory. I already get quite a few Google-directed
hits to various pages there. However, I keep track of how many
Wikipedia pages Googlebot visits, and so far it has visited only a
small fraction of what is out there.

At some point I fed Google several big files with links to all the
articles. It promptly read them and even followed some of the links (I
think). Some time later, the visits stopped. The pages that Google did
read can still be found through search engines.

I am talking about tens or hundreds of thousands of articles. Google
has indexed only a few thousand.

Full credit is given to Wikipedia and I fully comply with the GFDL.

My question is: is there something that prevents Google from following
up on this? Any ideas will be appreciated. The pages with links
contain 5,000 links each; there are 289 such pages plus the master list.
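
For illustration only, here is a minimal Python sketch of how link pages like these could be generated: a flat list of article URLs is split into pages of 5,000 links each, plus one master list that links to every page. The file names (`links-001.html`, `master.html`) and the function names are hypothetical; the post does not say how the real pages were built.

```python
from html import escape

LINKS_PER_PAGE = 5000  # page size mentioned in the post


def chunk(urls, size=LINKS_PER_PAGE):
    """Split urls into consecutive lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def render_page(urls):
    """Render one plain HTML page containing a list of links."""
    items = "\n".join(
        f'<li><a href="{escape(u, quote=True)}">{escape(u)}</a></li>'
        for u in urls
    )
    return f"<html><body><ul>\n{items}\n</ul></body></html>"


def build_index(urls, page_name="links-{:03d}.html"):
    """Return {filename: html} for every link page plus the master list."""
    pages = {}
    names = []
    for n, part in enumerate(chunk(urls), start=1):
        name = page_name.format(n)  # hypothetical naming scheme
        pages[name] = render_page(part)
        names.append(name)
    # The master list links to each link page, not to the articles.
    pages["master.html"] = render_page(names)
    return pages
```

Calling `build_index` with 12,001 URLs, for example, gives three link pages (5,000, 5,000 and 2,001 links) and one master list.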

i


#2
05-31-2005, 01:23 AM
davidof
Re: Google stopped indexing my wikipedia mirror

Ignoramus29781 wrote:
> I recently integrated wikipedia with my site


Uh huh, along with a bazillion other folk. The problem is that Google
does not really want to bother with all these Wikipedia mirrors, so it
runs duplicate-page algorithms. Maybe the indexer decided that your pages
were just duplicates of other content and told Googlebot not to
spider those links anymore. That would make sense to me, as spidering and
indexing pages that are of no benefit to searchers just wastes Google's
resources.
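
To make the idea of duplicate-page detection concrete, here is a rough Python sketch of one generic textbook approach: compare the sets of overlapping word sequences ("shingles") from two pages using Jaccard similarity. This is not Google's algorithm, which is not described in this thread; the function names, the shingle length `k` and the threshold are all assumptions chosen for illustration.

```python
def shingles(text, k=5):
    """Return the set of k-word sequences that occur in text."""
    words = text.lower().split()
    if len(words) < k:
        # Too short for a full shingle: treat the whole text as one.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_near_duplicate(text_a, text_b, threshold=0.8):
    """Guess whether two page texts are near-duplicates (assumed threshold)."""
    return jaccard(shingles(text_a), shingles(text_b)) >= threshold
```

A mirror page that copies an article and adds only a short credit line would share almost all of its shingles with the original, so it would score close to 1.0 under this measure.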
#3
05-31-2005, 01:24 AM
Ignoramus31500
Re: Google stopped indexing my wikipedia mirror

On Sun, 08 May 2005 09:42:25 +0200, davidof <david.george@g-dumpthisbit-mail.com> wrote:
> Ignoramus29781 wrote:
>> I recently integrated wikipedia with my site

>
> uh huh, along with a bizzilion other folk. The problem is that Google
> does not really want to bother with all these Wikipedia mirrors so runs
> duplicate page algorithms. Maybe the indexer decided that your pages
> were just duplicates of other content and told the googlebot not to
> spider those links anymore. That would make sense to me as spidering and
> indexing pages that are of no benefit to searchers just wastes Google's
> resources.


Sure, that makes sense. It may well happen sooner or later.

I began by mirroring individual Wikipedia pages for math-related content,
to complement the math pages that I already had. Then I decided to
mirror Wikipedia in a search-engine-friendly way, since all the pieces
were already in place.

In any case, Googlebot is back and is busy indexing my pages. How much
it crawls varies by day.

I fully comply with the Wikipedia license, giving credit, referring to
the GFDL, etc.

i
All times are GMT -4.
smallbusinessforum.com
