Technology Tips -blogger

Sunday, October 19, 2014

As you know, an XML sitemap is a file built from XML tags that gives the URL information of your site to web crawlers such as Googlebot or Bingbot.
If your site is hosted on your own web server, you can create the XML file with any online XML sitemap generator, upload it to the root directory of your site, and you are done. You have told the web crawler: here is the entire list of my URLs, so crawl them and show them to your audience based on keywords.
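For readers who have never opened such a file, here is a minimal sketch in Python of what an online generator produces. The domain, page list and dates are placeholders made up for illustration, and a real generator would normally add more fields.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(urls):
    """Return sitemap XML text for (url, last_modified) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in urls:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages on a placeholder domain.
pages = [
    ("http://www.example.com/", "2014-10-19"),
    ("http://www.example.com/about.html", "2014-10-01"),
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Each url element carries the page address and its last modified date, which is the same kind of information a crawler reads from a feed.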
The problem occurs when you do not have a full web server, for example when you are using free hosting space where you have limited access to the server and cannot upload files of your own, not even an .htaccess file. There is another way to tell the web crawler about your pages: the Atom feed, which Blogger supports.

Submitting an XML sitemap for Blogger using only Google Webmaster Tools:

You can work with atom.xml directly: enter atom.xml as the sitemap name in Google Webmaster Tools and it serves as a complete sitemap for Blogger, so you don't need to create an extra XML sitemap with other online tools. The feed already carries the date, time and modified tags you would expect to see in a sitemap.
From here on we assume you have added the Atom-type XML sitemap for your Blogger blog and that, fortunately, you are using a custom domain as well.

People often go elsewhere to generate an XML sitemap for Blogger, but you don't need to. Generating an Atom-type XML sitemap is done entirely in Google Webmaster Tools.
You will create the XML sitemap and then configure it inside Blogger using robots.txt, which holds customized entries that stop Google from crawling certain URLs.
Why use a custom robots.txt file? Because there are some URLs that we don't want Google to crawl or keep in its index. First, generate the XML sitemap by entering the simple atom.xml parameter in Google Webmaster Tools.

XML sitemap for Blogger with custom URLs:
Ideally your output could be one of the following for a blog hosted on Blogger.
What you have actually done is add the atom.xml parameter in the sitemap section of the Webmaster interface, along with further parameters so that all your blog posts get indexed. As in the case above, one set of parameters covers 500 URL entries. If your blog has 1,000 or more posts, add further entries too: just start the next one at index 501, and so on.
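The paging idea can be sketched in a few lines of Python. The post does not show the exact parameters, so the query string here is an assumption: it uses the redirect, start-index and max-results parameters commonly used with Blogger feeds, and the function name is made up for illustration.

```python
def feed_sitemap_entries(total_posts, page_size=500):
    """Return one atom.xml sitemap entry per block of page_size posts."""
    entries = []
    # Feed indexes start at 1, so the blocks begin at 1, 501, 1001, ...
    for start in range(1, total_posts + 1, page_size):
        entries.append(
            f"atom.xml?redirect=false&start-index={start}&max-results={page_size}"
        )
    return entries

# A blog with 1,200 posts needs three entries.
for entry in feed_sitemap_entries(1200):
    print(entry)
```

Each printed line is what you would type into the sitemap field after your blog address, one submission per block of 500 posts.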
Now you have to test this sitemap as well, which you do in the same section of Webmaster Tools. There you can also define the posts that you don't want Google to index, then copy the entire chunk and go back to Blogger.
Once you have defined and tested your sitemap in the robots.txt section of Webmaster Tools, you will end up with the following chunk:

User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

The chunk records that your XML sitemap was added as XML covering 500 posts on Blogger. You can now add more post URLs that you don't want indexed to the Disallow section, then copy the snippet above and paste it into Blogger.
Your XML sitemap for Blogger is now ready without visiting any other site, using only Webmaster Tools. Go to the Blogger dashboard, open Settings, then Search preferences, and paste the snippet into the section called Custom robots.txt. You are telling Google: this is my full XML sitemap, crawl and index the site, and here are some custom URL entries to leave out.
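Before pasting, you may want to double-check that your extra Disallow entries behave as intended. The sketch below is one way to do that with Python's standard urllib.robotparser module. It models only the User-agent: * group, the post path and domain are hypothetical, and Google's own matching rules can differ from this parser in edge cases.

```python
from urllib.robotparser import RobotFileParser

def build_robots_txt(extra_disallow=()):
    """Compose the robots.txt chunk with extra Disallow paths added."""
    lines = ["User-agent: *", "Disallow: /search"]
    lines += [f"Disallow: {path}" for path in extra_disallow]
    lines.append("Allow: /")
    return "\n".join(lines) + "\n"

def is_crawlable(robots_txt, url, agent="Googlebot"):
    """Check a URL against the rules with Python's standard parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Hypothetical post path that should stay out of the index.
rules = build_robots_txt(["/2014/10/private-post.html"])
print(rules)
print(is_crawlable(rules, "http://www.example.com/2014/10/public-post.html"))
print(is_crawlable(rules, "http://www.example.com/2014/10/private-post.html"))
print(is_crawlable(rules, "http://www.example.com/search/label/blogger"))
```

Python's parser applies the first rule that matches, which is why the Disallow lines are placed before Allow: / in this sketch.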
Hope you enjoyed learning here.

Friday, March 15, 2013

If you see duplicate URL entries for your site's pages in Google search and do not know how to fix them, here is a rundown of how to do it. These duplicate URLs usually come from your session URLs or archive pages.
In other words, they are duplicate-content URLs that exist within your own site: sometimes your site URL in date-archive format, sometimes your site URL with session entries, and sometimes the same URL with different parameters, all showing the same content under different URL entries in the search engine. You can track them by typing a command into Google and pressing Enter. You will then see your site's URL entries with different parameters showing the same pages.
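If you have a list of indexed URLs in hand, a short Python sketch like the one below can point out which pages appear more than once. It treats two URLs as the same page when they differ only in query string or fragment, which is a simplifying assumption, and the sample URLs are placeholders.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit

def group_duplicates(urls):
    """Group URLs that differ only in query string or fragment."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        canonical = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
        groups[canonical].append(url)
    # Keep only pages reachable under more than one URL.
    return {page: found for page, found in groups.items() if len(found) > 1}

# Placeholder URLs standing in for entries found in the index.
indexed = [
    "http://www.example.com/2013/03/post.html",
    "http://www.example.com/2013/03/post.html?sessionid=abc123",
    "http://www.example.com/2013/03/post.html?showComment=1",
    "http://www.example.com/2013/03/other.html",
]
duplicates = group_duplicates(indexed)
for page, found in duplicates.items():
    print(page, len(found))
```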

If you manage your blog with Blogspot/Blogger, you may notice that as you create posts, a number of archive pages get indexed alongside those new posts. Search engines probably do not like these duplicate archive pages.
Blogger gives you the option to remove those archive pages from Google's index. Basically, you need to tell Google not to index them, because archive pages are duplicate URLs.

Fix the duplicate content URL entries from Blogger:
If you are using the old Blogspot/Blogger interface, go to Settings >> Archive and set it to no archive. If you are using the new interface, you need to add some code telling Google not to index these archive pages, which is a pretty straightforward approach. The same code works with the old interface as well: just copy the code snippet mentioned below and paste it into your template.

Sample duplicate archive page entries:
Some sample pages are your URLs ending with a date and archive, or containing a special character such as a question mark.
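As a rough illustration, the Python sketch below flags URLs of that kind. The two archive path shapes in the pattern (a dated file name ending in _archive.html and a bare year/month folder) are assumptions about how such pages are usually addressed, so adjust the pattern to whatever you actually see in the index.

```python
import re
from urllib.parse import urlsplit

# Assumed archive shapes: /2013_03_01_archive.html or /2013/03/
ARCHIVE_PATH = re.compile(r"^/\d{4}(_\d{2}_\d{2}_archive\.html|/\d{2}/?)$")

def looks_like_duplicate(url):
    """Flag archive-style paths and URLs that carry a query string."""
    parts = urlsplit(url)
    return bool(parts.query) or bool(ARCHIVE_PATH.match(parts.path))

# Placeholder URLs for illustration.
for url in [
    "http://www.example.com/2013_03_01_archive.html",
    "http://www.example.com/2013/03/",
    "http://www.example.com/2013/03/post.html?m=1",
    "http://www.example.com/2013/03/post.html",
]:
    print(url, looks_like_duplicate(url))
```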

To fix them, go to the HTML section of your Blogger template, find the <head> tag, and add the following code right after it.

<b:if cond='data:blog.pageType == &quot;archive&quot;'>
<meta content='noindex,noarchive' name='robots'/>
</b:if>
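After saving the template, you can confirm that an archive page really carries the tag. The following is a minimal sketch using Python's standard html.parser module. It assumes you already have the rendered HTML of an archive page as a string (Blogger evaluates the condition on its side, so only the meta tag reaches the browser), and the class and function names are made up for illustration.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )

def robots_directives(html):
    """Return the set of robots directives found in the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives

# Hypothetical rendered archive page containing the tag from the snippet.
archive_html = (
    "<html><head><meta content='noindex,noarchive' name='robots'/></head>"
    "<body></body></html>"
)
print(sorted(robots_directives(archive_html)))
```

An empty result on an archive page would suggest the snippet is not being applied, while a regular post page should come back empty.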

This can also happen on your own website, where you see duplicate URLs of your session URLs.
Once you fix this issue, Google will show the exact number of published posts in its index.

Once Googlebot crawls your site again, it will show the exact number of HTML pages or posts, with no extra pages from your own site that have the same content.
You may see a boost in ranking after this classic SEO fix.

Some other attributes you can use for noindex:
The following sample tells MSN's crawler not to index the page, so neither the page nor its cached link appears in MSN search results:
<meta name="msnbot" content="noindex">