AUTHOR GENDER IDENTIFICATION IN FORENSIC LINGUISTICS

There is an increased need to identify the authors of written material in order to enhance security. The rise in cybercrime, in which online content that compromises individual rights or state security is published anonymously, makes it harder to trace and narrow down perpetrators. However, with modern forensics, there is growing interest in identifying individuals by categories such as age, race, religion and gender.

Firstly, an introduction to the study will open the dissertation with background information on how the research topic was arrived at and how the research will be of help. The section will also cover the logical issues of the research carried out. The introduction will serve as a base on which the reader can build an informed opinion of the target research. Readers will understand the need for research in the area and, further still, the necessity of creating the software.

Such research will require detailed information on linguistics, especially on how different genders use language differently. The literature review will, therefore, discuss factors that affect language choice and use by genders, especially the female gender. It will also review existing gender identification software and try to link the intended software with the platforms that this software uses. Information for the literature review will be collected from several sources across the internet and from related publications. Blogs and other online content will offer a base for the research. Similarly, books on forensic linguistics and software development will be used to show how the content of the linguistic study and the practice of software development can be merged to create working software.

Collecting primary data for such research would lead to biased information, and the data collected would, therefore, not produce credible results for analysis. As such, the data to be used in the research is secondary data. The data will be collected from known sources and related studies, and will be based mainly on the literature review. Additional data for testing will be sourced from blogs and online sources that contain related information and emails. The reliability of such data, however, is questionable, and it will be used on the assumption that the sources are true and credible.

In addition, the data collected will be presented graphically and discussed, with rational conclusions and recommendations drawn from the results. This will form the basis of the design of the software, as the recommendations will help in gauging what should be included and what should not.

The information collected, however, is factual, as it lies in the scientific domain of linguistics that deals with theories. Indeed, the research will focus on the linguistic theories that examine how certain words and phrases are used in language by different genders. Understanding these theories, and the methods of word choice and sentence construction involved, will be key in developing a combination of factors that forms a paradigm for language use. This will be fundamental in developing software that can identify the gender of an author of a publication with a favourable probability.
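The idea of combining word-choice and sentence-construction factors into a probability can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the three features, the pronoun list, the function names `extract_features` and `score`, and the use of a logistic function to map a weighted sum to a value between 0 and 1. The weights are placeholders that would have to be estimated from labelled data; the sketch makes no claim about how any gender actually writes.

```python
import math
import re

# Hypothetical pronoun list, used only to illustrate a word-choice feature.
PRONOUNS = {"i", "me", "my", "you", "your", "she", "her", "he", "him",
            "his", "we", "our", "they", "them", "their"}


def extract_features(text):
    """Return simple word-choice and sentence-construction measurements."""
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lowercased alphabetic tokens (apostrophes kept for contractions).
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not words or not sentences:
        return {"avg_sentence_len": 0.0, "avg_word_len": 0.0,
                "pronoun_rate": 0.0}
    return {
        "avg_sentence_len": len(words) / len(sentences),
        "avg_word_len": sum(len(w) for w in words) / len(words),
        "pronoun_rate": sum(w in PRONOUNS for w in words) / len(words),
    }


def score(features, weights, bias=0.0):
    """Map a weighted sum of features to a value in (0, 1) via a logistic function."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder weights: all zero, so every text scores exactly 0.5 until
# real weights are learned from labelled examples.
PLACEHOLDER_WEIGHTS = {"avg_sentence_len": 0.0, "avg_word_len": 0.0,
                       "pronoun_rate": 0.0}
```

Calling `score(extract_features(text), PLACEHOLDER_WEIGHTS)` returns 0.5 for any text, which reflects that the sketch carries no linguistic knowledge of its own; the research described above would supply the factors and their relative importance.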

In conclusion, by applying the fundamental principles of credible research, analysis and presentation of data, the development of a software program that aptly identifies the gender of an author is feasible.